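【Question title】: Deleting rows depending on the datetime in the date column of a CSV file in Python
【Posted】: 2016-02-22 21:23:48
【Question】:

Hello, I'm new here. I looked for similar questions but didn't find any, so maybe this one will also help other inexperienced programmers like me.

I have a CSV file with this structure: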

This is the list of workers.
Company blablabla.
name^position^start_date
John^manager^2015-01-01 08:00:00.0
Mary^supervisor^2014-10-01 09:00:00.0
Lucas^worker^2013-01-01 12:00:00.0
etc...

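I need a script that:
- deletes the first three lines, because they are not needed,
- asks the user for a start date,
- deletes every row whose date is earlier than or equal to the date the user entered,
- and finally keeps only the name (one column) and saves the result to the same CSV file.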

What I have come up with so far:

Deleting lines 1, 2 and 3:

import os

directory = ('C:/TEMP/')
os.chdir( directory )

FIRST_ROW_NUM = 1  # or 0
ROWS_TO_DELETE = {1, 2, 3}

with open( directory + 'FILE.csv', 'rt') as infile, open('FILE-NEW.csv', 'wt') as outfile:
    outfile.writelines(row for row_num, row in enumerate(infile, FIRST_ROW_NUM)
                        if row_num not in ROWS_TO_DELETE)

Reading the CSV file, splitting on the delimiter, and sorting:

import csv
from datetime import datetime

f = open('FILE-NEW.csv')
csv_f = csv.reader(f,delimiter='^')
csv_f = sorted(csv_f, key = lambda row: datetime.strptime(row[2], "%Y-%m-%d %H:%M:%S.%f"))

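Now I'm stuck. I need to ask the user for a date, but even if I just add a variable holding a fixed date, how do I compare it with the date column so that the rows with older dates are deleted? Thanks for your help. Regards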

【Comments】:

  • Welcome to SO! "if row_num not in ROWS_TO_DELETE" is ugly. All you need to skip those rows is a sequential count, not any cross-product search.
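
One way to read the comment above is to skip a fixed number of leading lines by position instead of testing each line number against a set. The sketch below uses `itertools.islice` for that; the `skip_header` helper and the `HEADER_LINES` constant are names made up for illustration, and the sample text is a shortened copy of the file from the question.

```python
import io
from itertools import islice

# Stand-in for the real file; content shortened from the question.
SAMPLE = """This is the list of workers.
Company blablabla.
name^position^start_date
John^manager^2015-01-01 08:00:00.0
"""

HEADER_LINES = 3  # hypothetical constant: how many leading lines to drop


def skip_header(lines, count=HEADER_LINES):
    """Return an iterator over every line after the first `count` lines."""
    return islice(lines, count, None)


remaining = list(skip_header(io.StringIO(SAMPLE)))
print(remaining)
```

The same call works on an open file object, since `islice` accepts any iterable of lines.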

Tags: python date csv datetime


【Solution 1】:

The basic idea is right, but you are not taking advantage of the ordering.
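
As a rough illustration of what using the ordering could mean: once the rows are sorted by date, a binary search can find the cut point without comparing every row. This is a sketch, not the method used in this answer; `parse` and `rows_after` are hypothetical helpers, and it assumes the "earlier or equal" rule from the question, so a row dated exactly at the cutoff is dropped.

```python
from bisect import bisect_right
from datetime import datetime

DT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

rows = [
    ["John", "manager", "2015-01-01 08:00:00.0"],
    ["Mary", "supervisor", "2014-10-01 09:00:00.0"],
    ["Lucas", "worker", "2013-01-01 12:00:00.0"],
]


def parse(row):
    """Parse the start_date column (index 2) of a row."""
    return datetime.strptime(row[2], DT_FORMAT)


def rows_after(rows, cutoff):
    """Return the rows dated strictly after `cutoff`, in date order."""
    ordered = sorted(rows, key=parse)
    dates = [parse(row) for row in ordered]
    # bisect_right gives the index of the first date greater than cutoff.
    return ordered[bisect_right(dates, cutoff):]


cutoff = datetime.strptime("2014-10-01 09:00:00.0", DT_FORMAT)
kept = rows_after(rows, cutoff)
print(kept)  # prints [['John', 'manager', '2015-01-01 08:00:00.0']]
```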

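The script in this answer writes the data file itself only so that readers can test the code; you don't need that step. You may also want to add try-except when working with external files.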
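
The try-except suggestion might look roughly like the following. This is only a sketch: `read_rows` is a hypothetical helper, the file name is deliberately one that should not exist, and returning an empty list on failure is just one possible policy.

```python
import csv


def read_rows(path, delimiter="^", skip=3):
    """Return the data rows of `path`, or an empty list if it cannot be read."""
    try:
        with open(path, newline="") as infile:
            lines = infile.readlines()[skip:]
    except OSError as exc:  # missing file, permission problem, etc.
        print(f"Could not read {path}: {exc}")
        return []
    return list(csv.reader(lines, delimiter=delimiter))


# Hypothetical file name, chosen so that the error path is exercised.
rows = read_rows("no-such-file-for-this-example.csv")
print(rows)
```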

import uuid
import csv
from datetime import datetime

CONTENT = """This is the list of workers.
Company blablabla.
name^position^start_date
John^manager^2015-01-01 08:00:00.0
Mary^supervisor^2014-10-01 09:00:00.0
Lucas^worker^2013-01-01 12:00:00.0"""

DT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

FIRST_LINE = 3
START_DATE = datetime.strptime("2014-10-01 09:00:00.0", DT_FORMAT)


IN_FILENAME = str(uuid.uuid4()) + '.txt'
OUT_FILENAME = 'FILE-NEW.csv'

if __name__ == '__main__':
    with open(IN_FILENAME, 'w') as outfile:
        outfile.write(CONTENT)

    with open(IN_FILENAME, 'r') as infile:
        reader = csv.reader(infile.readlines()[FIRST_LINE:],
                            delimiter='^')
        reader = sorted(reader,
                        key=lambda row: datetime.strptime(
                            row[2], DT_FORMAT
                        ))

    # newline='' lets the csv module control line endings itself
    with open(OUT_FILENAME, 'w', newline='') as outfile:
        writer = csv.writer(outfile, delimiter='^')
        # reader is already sorted by start date, so the rows that
        # pass this check are written in date order
        for row in reader:
            if datetime.strptime(row[2], DT_FORMAT) < START_DATE:
                continue
            writer.writerow(row)

    # print the result without the trailing newline of each line
    with open(OUT_FILENAME, 'r') as infile:
        for line in infile:
            print(line.rstrip('\n'))

Output:

Mary^supervisor^2014-10-01 09:00:00.0
John^manager^2015-01-01 08:00:00.0
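
The question also asked for two things the listing above does not do: dropping rows whose date equals the chosen date, and keeping only the name column. A minimal sketch of that variant follows. `names_started_after` is a hypothetical helper, the date string is hard-coded here where a real script would read it with `input()`, and it assumes the user types the date in the same format as the file.

```python
import csv
from datetime import datetime

DT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def names_started_after(lines, cutoff, skip=3):
    """Return the names whose start_date is strictly later than `cutoff`."""
    reader = csv.reader(lines[skip:], delimiter="^")
    return [
        row[0]
        for row in reader
        if row and datetime.strptime(row[2], DT_FORMAT) > cutoff
    ]


lines = [
    "This is the list of workers.",
    "Company blablabla.",
    "name^position^start_date",
    "John^manager^2015-01-01 08:00:00.0",
    "Mary^supervisor^2014-10-01 09:00:00.0",
    "Lucas^worker^2013-01-01 12:00:00.0",
]

# In a real script this string would come from input(); fixed here
# so the example runs without user interaction.
typed = "2014-10-01 09:00:00.0"
cutoff = datetime.strptime(typed, DT_FORMAT)
names = names_started_after(lines, cutoff)
print(names)  # prints ['John']
```

Writing `names` back, one per line, to the original file would then cover the last requirement; reading the whole file into memory first, as here, avoids overwriting it while it is still being read.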

【Comments】:
