[Posted]: 2016-07-24 16:16:27
[Question]:
I have a csv containing some data that I am reading into pandas:
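import sys
import pandas as pd

filename = sys.argv[1]
# header=None: the first line of the file is treated as data, not as column names
data = pd.read_csv(filename, sep=';', header=None)
xy = data
print(str(xy))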
Result:
       0                                 1
0  label                              data
1      x                      6,8,10,14,18
2      y                    7,9,13,17.5,18
3      z                         0,0,1,1,1
4      r  2,13,31,33,34,4324,32413,431,666
However, when I try to select part of the frame:
xy = data['2']
xy = data['y']
xy = data['label']
It just gives me the same error every time:
Traceback (most recent call last):
  File "Regress[AA]--[01].py", line 10, in <module>
    xy = data['label']
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1350, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3290, in get
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/indexes/base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)
  File "pandas/index.pyx", line 161, in pandas.index.IndexEngine.get_loc (pandas/index.c:4084)
KeyError: 'label'
How should I format my selection request?
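For reference, a minimal sketch of why each of the lookups above raises KeyError: with header=None, pandas labels the columns with the integers 0 and 1, and "label" and "data" become ordinary values in row 0. The inline CSV_TEXT is an assumption that stands in for the file passed via sys.argv[1], and the variable names are only illustrative.

```python
import io

import pandas as pd

# Assumed stand-in for the contents of the file from the question.
CSV_TEXT = """label;data
x;6,8,10,14,18
y;7,9,13,17.5,18
z;0,0,1,1,1
r;2,13,31,33,34,4324,32413,431,666
"""

# header=None: columns are labelled 0 and 1; the header line becomes row 0.
raw = pd.read_csv(io.StringIO(CSV_TEXT), sep=';', header=None)
raw_columns = list(raw.columns)   # integer labels, so raw['label'] fails
first_column = raw[0]             # select a column by its integer label
row_y = raw.iloc[2]               # select the 'y' row by position

# Default header='infer': the first line supplies the column names.
df = pd.read_csv(io.StringIO(CSV_TEXT), sep=';')
labels = df['label']              # this lookup now works
y_data = df.loc[df['label'] == 'y', 'data'].iloc[0]
```

Note that data['2'] and data['y'] fail for a second reason: [] on a DataFrame selects columns, so rows have to be selected with .iloc, .loc, or a boolean mask.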
Edit: Thanks to @Merlin's help, I got it working:
filename = sys.argv[1]
df = pd.read_csv(filename, sep=';')
for i in range(len(df.label)):
    a = str(df['label'][i])
    b = str(df['data'][i])
    print("Row: {} - Data: {}".format(a, b))
Which gives me:
Row: x - Data: 6,8,10,14,18
Row: y - Data: 7,9,13,17.5,18
Row: z - Data: 0,0,1,1,1
Row: r - Data: 2,13,31,33,34,4324,32413,431,666
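The same output can probably be produced without indexing by position. The sketch below pairs the two columns with zip and, as an alternative, uses the label column as the index so that a single row can be looked up by name. CSV_TEXT is an assumed stand-in for the real file, and the names lines, by_label, and r_data are illustrative.

```python
import io

import pandas as pd

# Assumed stand-in for the contents of the file from the question.
CSV_TEXT = """label;data
x;6,8,10,14,18
y;7,9,13,17.5,18
z;0,0,1,1,1
r;2,13,31,33,34,4324,32413,431,666
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), sep=';')

# Walk both columns in parallel instead of indexing with range(len(...)).
lines = ["Row: {} - Data: {}".format(label, data)
         for label, data in zip(df['label'], df['data'])]
for line in lines:
    print(line)

# Use the labels as the index to fetch one row's data by name.
by_label = df.set_index('label')['data']
r_data = by_label['r']
```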
[Comments]:
-
Don't change the default header='infer'. Try
pd.read_csv(filename,sep=':')
-
It has to be that way:
x, y and z each have 5 values, but r has 9. The header has to be None, otherwise I get an error: ValueError: Some errors were detected ! Line #3 (got 10 columns instead of 6)
-
Wait, that's wrong: it should be line 4.
-
The separator is ";"; the commas are for the arrays:
x = [6,8,10,14,18]
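Since each data cell is read as a single comma-separated string, turning it into an array like the one above takes one more step. A minimal sketch, assuming every value parses as a float; parse_values is a hypothetical helper, not part of pandas, and the rows dict just repeats two of the strings from the question:

```python
def parse_values(text):
    """Split a comma-separated string into a list of floats."""
    return [float(part) for part in text.split(',')]


# Two of the data strings from the question, keyed by their label.
rows = {
    'x': '6,8,10,14,18',
    'y': '7,9,13,17.5,18',
}

arrays = {label: parse_values(text) for label, text in rows.items()}
```

Rows of different lengths (5 values for x, 9 for r) are not a problem here, because each string is parsed on its own.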
Tags: python csv pandas dataframe row