【Question title】: How to read a specific column of a CSV file using Python
【Posted】: 2015-10-25 23:07:20
【Question description】:

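I am new to Scikit-Learn, and I want to convert a collection of labeled data into a dataset. I have converted the data's .csv file into a NumPy array, but one problem I have run into is sorting the data into training sets based on whether a flag is present in the second column. I would like to know how to access specific rows and columns of a .csv file using the Pandas utility module. Here is my code: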

    import numpy as np
    import pandas as pd
    import csv
    import nltk
    import pickle
    from nltk.classify.scikitlearn import SklearnClassifier
    from sklearn.naive_bayes import MultinomialNB,BernoulliNB
    from nltk.classify import ClassifierI
    from statistics import mode




    def numpyfy(fileid):
         data = pd.read_csv(fileid,encoding = 'latin1')
         #pd.readline(data)
         target = data["String"]
         data1 = data.ix[1:,:-1]
         #print(data)
         return data1
    def learn(fileid):
         trainingsetpos = []
         trainingsetneg = []
         datanew = numpyfy(fileid)
         if(datanew.ix['Status']==1):
            trainingsetpos.append(datanew.ix['String'])
         if(datanew.ix['Status']==0):
            trainingsetneg.append(datanew.ix['String'])

    print(list(trainingsetpos))

【Question comments】:

Tags: python csv numpy scikit-learn


【Solution 1】:

You can use boolean indexing to split the data. Something like:

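    import pandas as pd


    def numpyfy(fileid):
        df = pd.read_csv(fileid, encoding='latin1')
        # pop() removes the 'String' column from df and returns it.
        target = df.pop('String')
        # .ix was removed in pandas 1.0; with the default integer index,
        # .iloc selects the same rows and columns. This skips the first row
        # and the last column, so 'Status' must not be the last column left.
        data = df.iloc[1:, :-1]
        return target, data


    def learn(fileid):
        target, data = numpyfy(fileid)
        # Boolean indexing: keep only the rows where the mask is True.
        trainingsetpos = data[data['Status'] == 1]
        trainingsetneg = data[data['Status'] == 0]

        print(trainingsetpos)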
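
If the goal is only the text of the positive and negative rows, a self-contained variation might look like the sketch below. It is not the asker's actual data: the sample CSV text, the `Extra` column, and the helper name `split_by_status` are all made up for illustration, and only the column names `String` and `Status` come from the question. It reads just the two needed columns with `usecols`, then combines a boolean mask with a column label in `.loc`.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the asker's file; the real
# column layout is not shown in the question.
CSV_TEXT = """String,Status,Extra
good movie,1,a
bad movie,0,b
great plot,1,c
"""


def split_by_status(source):
    """Return (positive, negative) Series of 'String' values, split on 'Status'."""
    # usecols limits parsing to the named columns, so 'Extra' is never loaded.
    df = pd.read_csv(source, usecols=["String", "Status"])
    # .loc[row_mask, column_label] filters rows and picks one column at once.
    positive = df.loc[df["Status"] == 1, "String"]
    negative = df.loc[df["Status"] == 0, "String"]
    return positive, negative


positive, negative = split_by_status(io.StringIO(CSV_TEXT))
print(positive.tolist())  # ['good movie', 'great plot']
print(negative.tolist())  # ['bad movie']
```

`source` can also be a file path, as with `fileid` in the code above. Calling `.tolist()` or `.to_numpy()` on the results gives plain Python or NumPy containers if later code expects them.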

【Comments】:
