Compare two text files and find common timestamps

【Question title】: Compare two text files and find common timestamps
【Posted】: 2019-08-30 07:43:24
【Question】:

I have two text files. Both contain many rows like the following:

2014-09-06 12:18:29  0 7Z
2014-09-06 21:00:41  0 7Z
2014-09-06 02:28:06  0 7Z
2014-09-06 13:06:53  0 7Z

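I want to compare the two files and create a new file containing the second-column values that are similar in both files. The catch is that the second column is a timestamp, and I want to count two values as matching when they are identical or differ by at most 5 seconds. For example, for the first row in my sample above, if the other file has a second-column value in the range "12:18:29 to 12:18:34", that row would be considered similar.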

I read the first file like this:

f= open ('green.txt','r')
f= open ('red.txt','r')
with open ('common', 'w') as h:
    for line in f:
        elements = line.split (' ')
        data = elements [1]

But since I want to compare timestamps, I don't know how to proceed. In my code, data is a string.

【Comments】:

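Don't parse the file yourself; try pandas: pandas.pydata.org/pandas-docs/stable/reference/api/… It will also make the data manipulation easier.
Read up on datetime.strptime and the "Supported operations" section of the datetime documentation.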
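
As a minimal sketch of what the second comment points to: `datetime.strptime` turns the leading date and time of a line into a `datetime`, and subtracting two `datetime` objects gives a `timedelta` that can be compared against a threshold. The helper name `leading_timestamp` and the two sample lines are illustrative assumptions, not part of the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def leading_timestamp(line):
    """Parse the date and time that make up the first two fields of a line."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", FMT)

a = leading_timestamp("2014-09-06 12:18:29  0 7Z")
b = leading_timestamp("2014-09-06 12:18:25  0 7Z")

gap = abs(a - b)                      # subtracting datetimes gives a timedelta
print(gap)                            # 0:00:04
print(gap <= timedelta(seconds=5))    # True
```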

Tags: python text timestamp

【Solution 1】:

It took me a while to figure this out, so here we go.

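Basically, I open both files and store all the lines from the first file in a list so that they can be compared with the lines from the second file.
I use the datetime library to extract the timestamps (with datetime.strptime() and exception handling, because each timestamp is followed by some extra text).
Then I take the absolute difference between the two datetime objects and use it as the condition: as long as the difference is at most 5 seconds, the timestamp is written to the output file.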
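
One detail worth checking when implementing the comparison step: `timedelta.seconds` holds only the seconds component (0 to 86399) and leaves out whole days, so two timestamps on different dates can look closer than they are. The sketch below uses made-up timestamps to show how that differs from `total_seconds()`.

```python
from datetime import datetime

t1 = datetime(2014, 9, 6, 12, 18, 29)
t2 = datetime(2014, 9, 7, 12, 18, 31)   # one day and two seconds later (made-up value)

diff = abs(t1 - t2)
print(diff.seconds)            # 2
print(diff.total_seconds())    # 86402.0

within_5s = diff.total_seconds() <= 5
print(within_5s)               # False
```

If every timestamp in both files falls on the same date, the two checks agree; `total_seconds()` is simply the safer default.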

I made some changes to the text files to illustrate the output.

green.txt

2014-09-06 12:18:29  0 7Z
2014-09-06 21:00:41  0 7Z
2014-09-06 02:28:06  0 7Z
2014-09-06 13:06:53  0 7Z

red.txt

2014-09-06 12:18:25  0 7Z
2014-09-06 21:00:50  0 7Z
2014-09-06 02:28:23  0 7Z
2014-09-06 13:06:58  0 7Z

The output is the file 'common.txt':

12:18:29
13:06:53

The code is as follows:

from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp format at the start of each line


def parse_datetime(line, fmt):
    """Parse the timestamp at the start of a line, ignoring any trailing text."""
    try:
        return datetime.strptime(line, fmt)
    except ValueError as v:
        prefix = 'unconverted data remains: '
        if v.args and v.args[0].startswith(prefix):
            # Cut off the unparsed tail reported in the error message, then retry.
            line = line[:-(len(v.args[0]) - len(prefix))]
            return datetime.strptime(line, fmt)
        raise


with open('green.txt') as f1, open('red.txt') as f2, open('common.txt', 'w') as f3:
    # Store every timestamp from green.txt in a list.
    times1 = [parse_datetime(line, FMT) for line in f1 if line.strip()]

    for line in f2:
        if not line.strip():
            continue  # skip blank lines
        t2 = parse_datetime(line, FMT)

        for t1 in times1:  # compare this red.txt timestamp with every green.txt timestamp
            # total_seconds() covers the whole difference; .seconds alone ignores whole days
            if abs(t1 - t2).total_seconds() <= 5:
                f3.write(t1.strftime("%H:%M:%S") + '\n')
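
The nested loop above compares every pair of lines, which can get slow for large files. One possible alternative, offered as a sketch rather than as the answerer's method, sorts the timestamps from one file and uses the standard `bisect` module to jump straight to the nearest candidate. The names `parse_line`, `common_times` and the `tolerance` parameter are hypothetical, and the sample lists stand in for the contents of green.txt and red.txt.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def parse_line(line):
    """Parse the timestamp formed by the first two whitespace-separated fields."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", FMT)

def common_times(green_lines, red_lines, tolerance=timedelta(seconds=5)):
    """Return the green timestamps that have a red timestamp within `tolerance`."""
    red = sorted(parse_line(line) for line in red_lines if line.strip())
    matches = []
    for line in green_lines:
        if not line.strip():
            continue  # skip blank lines
        t = parse_line(line)
        i = bisect_left(red, t - tolerance)   # first red timestamp >= t - tolerance
        if i < len(red) and red[i] <= t + tolerance:
            matches.append(t)
    return matches

green = ["2014-09-06 12:18:29  0 7Z", "2014-09-06 21:00:41  0 7Z",
         "2014-09-06 02:28:06  0 7Z", "2014-09-06 13:06:53  0 7Z"]
red = ["2014-09-06 12:18:25  0 7Z", "2014-09-06 21:00:50  0 7Z",
       "2014-09-06 02:28:23  0 7Z", "2014-09-06 13:06:58  0 7Z"]

for t in common_times(green, red):
    print(t.strftime("%H:%M:%S"))
# 12:18:29
# 13:06:53
```

Unlike the nested loop, this version reports each green.txt timestamp at most once, even when several red.txt timestamps fall within the tolerance.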

【Comments】: