【Question title】: Parsing timestamp using csv module and datetime module
【Posted】: 2020-08-15 23:25:54
【Question】:

I'm having trouble with the datetime module in Python. I have data from a csv file:

user_id,timestamp
563,0:00:21
671,0:00:26
780,0:00:28

This is my code:

import csv
from datetime import datetime

path = "/home/haldrik/dev/python/data/dataset.csv"
file = open(path, newline='')

reader = csv.reader(file, delimiter=',')

header = next(reader) # Ignore first row.

data = []
for row in reader:
    # row = [user_id, timestamp]
    user_id = row[0]
    timestamp = datetime.strptime(row[1], '%H:%M:%S').time()
    
    data.append([user_id, timestamp])

The code throws this error:

Traceback (most recent call last):
  File "/home/haldrik/dev/python/instances_web_site.py", line 15, in <module>
    date = datetime.strptime(row[1], '%H:%M:%S').time()
  File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '' does not match format '%H:%M:%S'

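I can't find where the error is. As far as I can see, the data matches the specified time format.

Dumping the output of the CSV import step, I can confirm that it works; see this code snippet (not included in the code above):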

data_import = [row for row in reader]
print(data_import[0])

It outputs this:

['563','0:00:21']

【Comments】:

    Tags: python csv parsing time


    【Solution 1】:
    • One or more values in the timestamp column are the problem: a row that looks like 440, (an empty timestamp field) produces time data '' does not match format '%H:%M:%S'.
    • Wrap datetime.strptime(row[1], '%H:%M:%S').time() in a try-except block.
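To see which rows are affected before deciding how to handle them, something like the following may help. It is a minimal sketch, not part of the original answer: the helper name find_bad_rows and the SAMPLE string are hypothetical, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample standing in for the real file; the last row has an empty timestamp.
SAMPLE = "user_id,timestamp\n563,0:00:21\n671,0:00:26\n440,\n"


def find_bad_rows(lines, fmt="%H:%M:%S"):
    """Return (line_number, row) pairs whose second field does not match fmt."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    bad = []
    for row in reader:
        try:
            datetime.strptime(row[1], fmt)
        except (ValueError, IndexError):
            # ValueError: empty or malformed timestamp; IndexError: row has fewer than two fields
            bad.append((reader.line_num, row))
    return bad


bad_rows = find_bad_rows(io.StringIO(SAMPLE))
print(bad_rows)
```

With a real file, pass the object returned by open(path, newline='') instead of the StringIO.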

    test.csv

    user_id,timestamp
    563,0:00:21
    671,0:00:26
    780,0:00:28
    440,
    

    Code

    import csv
    from datetime import datetime
    
    path = "test.csv"
    file = open(path, newline='')
    
    reader = csv.reader(file, delimiter=',')
    
    header = next(reader) # Ignore first row.
    
    data = []
    for row in reader:
        # row = [user_id, timestamp]
        user_id = row[0]
        try:
            timestamp = datetime.strptime(row[1], '%H:%M:%S').time()
        except ValueError as e:
            timestamp = row[1]
    #         continue  # use this if you do not want the row added to data, comment it out otherwise
        
        data.append([user_id, timestamp])
    
    
    print(data)
    [out]:
    [['563', datetime.time(0, 0, 21)], ['671', datetime.time(0, 0, 26)], ['780', datetime.time(0, 0, 28)], ['440', '']]
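If rows without a timestamp should simply be dropped, an alternative is to check for the empty field up front instead of catching the exception. The following is a sketch under assumptions: parse_rows and SAMPLE are hypothetical names, csv.DictReader is swapped in for csv.reader so fields can be looked up by header name, and io.StringIO replaces the real file.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample: one row with an empty timestamp and one row that is only a comma.
SAMPLE = "user_id,timestamp\n563,0:00:21\n440,\n,\n780,0:00:28\n"


def parse_rows(lines, fmt="%H:%M:%S"):
    """Parse rows into [user_id, time] pairs, skipping rows with no timestamp."""
    reader = csv.DictReader(lines)  # the first line supplies the field names
    data = []
    for row in reader:
        raw = (row.get("timestamp") or "").strip()
        if not raw:
            continue  # empty or missing timestamp: skip the row
        data.append([row["user_id"], datetime.strptime(raw, fmt).time()])
    return data


# The with block closes the file-like object when parsing is done.
with io.StringIO(SAMPLE) as fh:
    data = parse_rows(fh)
print(data)
```

A timestamp that is non-empty but malformed still raises ValueError here, which may be preferable when silent data loss would be worse than a crash.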
    

    【Comments】:

    • You're right, the csv file is full of empty data; the end of the file contains nothing but commas ('','').