【Posted】: 2017-06-03 04:17:59
【Question】:
I have a CSV (TSV) file with this header and first row:
"Message Name" "Field" "Base Label" "Base Label Update Date" "Translated Label" "Translated Label Update Date" "Language"
"Message" "subject_template" "New Task: Assess Distribution Outcomes for ""${docNameNoLink}"", ""${docNumber}""" "8/10/16 4:17:43 PM" "Nouvelle tâche : évaluez le résultat de la distribution de « ${docNameNoLink} »." "2/17/14 5:09:10 AM" "fr"
When I try to read the file with this code:
import csv
import sys

# fileName holds the path to the TSV file
with open(fileName, 'r', encoding='utf-8', errors='replace') as fdata:
    csv.register_dialect('tsv', delimiter='\t', quoting=csv.QUOTE_NONE)
    reader = csv.reader(fdata, dialect='tsv')
    try:
        for row in reader:
            print(row)
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(fileName, reader.line_num, e))
I get this error message: file fileName, line 1: line contains NULL byte
However, if I run the same code without the errors='replace' (or errors='ignore') argument:
with open(fileName, 'r', encoding='utf-8') as fdata:
    csv.register_dialect('tsv', delimiter='\t', quoting=csv.QUOTE_NONE)
    reader = csv.reader(fdata, dialect='tsv')
    try:
        for row in reader:
            print(row)
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(fileName, reader.line_num, e))
I get the following error message instead:
  File "csvFiles.py", line 76, in <module>
    for row in reader:
  File "c:\Python35\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
What is the likely cause of these errors? How can I fix it and get the script working?
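Both symptoms are consistent with a file that is not UTF-8 but UTF-16: the UTF-16 little-endian byte order mark begins with byte 0xff, and ASCII characters stored as UTF-16 carry a 0x00 byte each, which would show up as NULL characters once errors='replace' lets decoding continue. That is an inference from the two messages, not something confirmed for this file. A minimal sketch for checking it, assuming the file starts with a byte order mark, is shown here; `detect_bom_encoding` and `read_tsv` are hypothetical helper names, and the sketch uses the csv module's default quoting rather than `csv.QUOTE_NONE` so that the doubled quotes in the sample row are unescaped.

```python
import codecs
import csv


def detect_bom_encoding(path, default='utf-8'):
    """Guess a text encoding from the file's byte order mark (hypothetical helper)."""
    with open(path, 'rb') as fh:
        head = fh.read(4)
    # Check UTF-32 before UTF-16: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return 'utf-32'
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return 'utf-16'
    if head.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'
    # No BOM found: fall back to the caller's default (an assumption, not a detection).
    return default


def read_tsv(path):
    """Return all rows of a tab-separated file, decoded with the BOM-detected encoding."""
    encoding = detect_bom_encoding(path)
    # newline='' is what the csv module documentation recommends for file objects.
    with open(path, 'r', encoding=encoding, newline='') as fdata:
        return list(csv.reader(fdata, delimiter='\t'))
```

Calling `detect_bom_encoding(fileName)` first shows which encoding the sketch would pick; if it returns 'utf-8', the file has no byte order mark and the cause lies elsewhere.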
【Comments】:
Tags: python-3.x csv unicode