[Posted]: 2016-04-22 03:38:47
[Question]:
I have a CSV file that uses a tab ('\t') as the delimiter. It contains 5 columns. I tried this:
import numpy as np
#b=np.loadtxt(r'train_set.csv',dtype=str,delimiter=' ')
my_data = np.genfromtxt('train_set.csv', delimiter='\t')
print my_data
But I get the following error:
Traceback (most recent call last):
  File "./wordCloud.py", line 7, in <module>
    my_data = np.genfromtxt('train_set.csv', delimiter='\t')
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1667, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !
    Line #14 (got 4 columns instead of 5)
    Line #21 (got 4 columns instead of 5)
    Line #135 (got 4 columns instead of 5)
Any ideas? I don't know Python very well (yet :))!
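One way to investigate the lines named in the error is to count the tab-separated fields on each line without NumPy. The sketch below is only an illustration: the helper names `count_fields` and `bad_lines` and the in-memory sample are hypothetical, and a plain `split` may not count fields exactly the way `genfromtxt` does (for instance, `genfromtxt` treats `#` as a comment marker by default, which could shorten a row that contains `#` in its text).

```python
import io
from collections import Counter


def count_fields(lines, delimiter="\t"):
    """Map each field count to the number of non-blank lines that have it."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[len(line.rstrip("\r\n").split(delimiter))] += 1
    return counts


def bad_lines(lines, expected, delimiter="\t"):
    """Return 1-based numbers of non-blank lines whose field count differs from `expected`."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.strip() and len(line.rstrip("\r\n").split(delimiter)) != expected
    ]


# Hypothetical sample standing in for train_set.csv: the third row is one column short.
sample = io.StringIO(u"a\tb\tc\td\te\n1\t2\t3\t4\t5\n1\t2\t3\t4\n")
rows = sample.readlines()

print(count_fields(rows))
print(bad_lines(rows, expected=5))  # -> [3]
```

For the real file, `rows` would come from opening `train_set.csv` instead of the sample; looking at the reported lines directly should show whether they are truly short or only appear short to the parser.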
The dataset (which I am also checking right now) looks like this:
EDIT:
If I do this:
my_data = np.genfromtxt('train_set.csv', delimiter=' ')
Then I don't get any error, but the output is:
[ nan nan nan ..., nan nan nan]
Following the answer gives these warnings:
...
Line #26310 (got 4 columns instead of 5)
Line #26383 (got 4 columns instead of 5)
Line #26448 (got 4 columns instead of 5)
Line #26489 (got 4 columns instead of 5)
Line #26589 (got 4 columns instead of 5)
Line #26593 (got 4 columns instead of 5)
Line #26888 (got 4 columns instead of 5)
Line #27002 (got 4 columns instead of 5)
Line #27065 (got 4 columns instead of 5)
Line #27234 (got 3 columns instead of 5)
Line #27327 (got 4 columns instead of 5)
Line #27418 (got 4 columns instead of 5)
Line #27594 (got 4 columns instead of 5)
Line #27827 (got 4 columns instead of 5)
Line #27944 (got 4 columns instead of 5)
Line #28074 (got 4 columns instead of 5)
Line #28102 (got 4 columns instead of 5)
Line #28147 (got 4 columns instead of 5)
Line #28224 (got 4 columns instead of 5)
Line #28264 (got 4 columns instead of 5)
Line #28344 (got 4 columns instead of 5)
Line #28484 (got 4 columns instead of 5)
warnings.warn(errmsg, ConversionWarning)
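Warnings of this form are what `np.genfromtxt` emits when it is called with `invalid_raise=False`: rows with the wrong number of columns are skipped and reported instead of raising `ValueError`. The sketch below assumes a reasonably recent NumPy on Python 3 and uses a hypothetical in-memory sample rather than the real file. The choices `dtype=str` (so text is not converted to `nan` by the default float dtype) and `comments=None` (so `#` inside text is not treated as a comment) are assumptions about what this data needs, not something confirmed for `train_set.csv`.

```python
import io
import warnings

import numpy as np

# Hypothetical sample standing in for train_set.csv:
# two complete rows and one row that is a column short.
sample = io.StringIO(u"a\tb\tc\td\te\n1\t2\t3\t4\t5\n1\t2\t3\t4\n")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = np.genfromtxt(
        sample,
        delimiter="\t",
        dtype=str,            # keep every column as text
        comments=None,        # do not cut rows at '#'
        invalid_raise=False,  # skip malformed rows and warn instead of raising
    )

print(data.shape)
for warning in caught:
    print(warning.message)
```

Skipping rows silently loses data, so this is better suited to finding out how many rows are affected than to producing the final dataset.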
The output also contains some strange characters, for example:
costing at least \xc2\xa3429
instead of costing at least £429.
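The bytes `\xc2\xa3` are the UTF-8 encoding of `£`, so the text itself is probably intact; the escaped form is typically what appears when a byte string is displayed through its `repr`, as happens when a container of strings is printed. A minimal sketch, assuming the file is UTF-8 encoded (not confirmed for this dataset):

```python
# The byte string exactly as it appears in the output above.
raw = b"costing at least \xc2\xa3429"

# Decoding as UTF-8 turns the two bytes \xc2\xa3 into the single character '£'.
text = raw.decode("utf-8")

print(text)  # -> costing at least £429
```

If the assumption about the encoding is wrong, `decode` raises `UnicodeDecodeError` on the first invalid byte, which is itself a useful check.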
[Comments]: