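【Posted】: 2018-04-12 05:38:09
【Question】:
I have an IoT device that has been running for 18 months, and there is a lot of data to analyze. The device has been switched on and off at various times, and I want to work out when it was on from its timestamps. They have the following format, with one sample taken every minute: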
08-01-01 10:00
08-01-01 10:01
08-01-01 10:00
08-01-02 03:10
08-01-02 03:11
Ideally, I would like to produce a report in the following format:
Time session 1 - 08-01-01 10:00 08-01-01 10:02 Session 1 ran for three minutes
Time session 2 - 08-01-02 02:10 08-01-02 03:11 Session 2 ran for 2 minutes
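The problem is that I have more than 150k timestamps and cannot think of a fast way to sort through the data. Currently I build a second array that holds every minute from the first timestamp to the last. I then compare the original timestamp array against this master array and set a flag on each match. It works, but it is not efficient, and I am trying to come up with a better way to analyze this data.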
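One way to avoid the master array altogether, offered here as a minimal sketch rather than as the asker's method, is to parse the timestamps, sort them, and start a new session whenever the gap to the previous sample exceeds one minute. The function name `find_sessions`, the format string `%y-%m-%d %H:%M` (which reads the sample data as year-month-day), and the choice to count a session's length as its number of one-minute samples are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

ONE_MINUTE = timedelta(minutes=1)

def find_sessions(stamps, fmt="%y-%m-%d %H:%M", step=ONE_MINUTE):
    """Group timestamp strings into (start, end) runs of consecutive samples.

    `fmt` is an assumed reading of the sample data (two-digit year first).
    """
    # Parse, drop duplicates, and sort so that out-of-order rows do not matter.
    times = sorted({datetime.strptime(s, fmt) for s in stamps})
    sessions = []
    for t in times:
        if sessions and t - sessions[-1][1] <= step:
            sessions[-1][1] = t        # still inside the current session
        else:
            sessions.append([t, t])    # gap found: start a new session
    return [(start, end) for start, end in sessions]

samples = ["08-01-01 10:00", "08-01-01 10:01", "08-01-01 10:00",
           "08-01-02 03:10", "08-01-02 03:11"]
sessions = find_sessions(samples)

for n, (start, end) in enumerate(sessions, 1):
    # Assumption: a session's length is its number of one-minute samples.
    minutes = (end - start) // ONE_MINUTE + 1
    print(f"Time session {n} - {start:%y-%m-%d %H:%M} {end:%y-%m-%d %H:%M} "
          f"Session {n} ran for {minutes} minutes")
```

Sorting dominates the cost, so 150k timestamps need roughly O(n log n) work instead of one comparison per sample per generated minute.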
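import csv
from datetime import datetime, timedelta

# Read the first column of every row; it holds the raw timestamp string.
# newline='' is the csv module's recommended way to open files in Python 3
# (the original 'rU' mode is no longer accepted by recent Python versions).
with open('HomeOfficeApr.csv', newline='') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    orgtimestamp = []
    for row in readCSV:
        ts = row[0]
        orgtimestamp.append(ts)

# Drop the last 9 characters so that only "YYYY-MM-DD HH:MM" remains.
for elements in range(len(orgtimestamp)):
    orgtimestamp[elements] = orgtimestamp[elements][:-9]
    # print(orgtimestamp[elements])
print("First time stamp")
print(orgtimestamp[0])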
print("Create time stamp range")

def datetime_range(start, end, delta):
    """Yield datetimes from start (inclusive) to end (exclusive) in steps of delta."""
    current = start
    if not isinstance(delta, timedelta):
        delta = timedelta(**delta)
    while current < end:
        yield current
        current += delta

# Timestamps hard coded - need to change to first and last timestamp
# (integer literals with a leading zero, such as 04, are a syntax error in Python 3)
start = datetime(2017, 4, 13, 8, 30)
end = datetime(2018, 12, 31, 12, 0)
gentimestamp = []
# this unlocks the following interface:
for dt in datetime_range(start, end, {'days': 0, 'minutes': 1}):
    gentimestamp.append(str(dt))
# Strip the trailing ":SS" so the strings match the "YYYY-MM-DD HH:MM" samples.
for i in range(len(gentimestamp)):
    gentimestamp[i] = gentimestamp[i][:-3]
print("Compare time stamp")
print(len(gentimestamp))
# One flag per generated minute: "Y" by default, "X" once a sample matches it.
CompareTimeStampArray = [None] * len(gentimestamp)
for i in range(len(CompareTimeStampArray)):
    CompareTimeStampArray[i] = "Y"
# Compare every sample with every generated minute, field by field.
for i in range(len(orgtimestamp)):
    for y in range(len(gentimestamp)):
        if (orgtimestamp[i][0:4]) == (gentimestamp[y][0:4]):
            #print("Match year")
            #print(orgtimestamp[i][0:4])
            #print(gentimestamp[y][0:4])
            if (orgtimestamp[i][5:7]) == (gentimestamp[y][5:7]):
                #print("Match month")
                #print(orgtimestamp[i][5:7])
                #print(gentimestamp[y][5:7])
                if (orgtimestamp[i][8:10]) == (gentimestamp[y][8:10]):
                    #print("Match day")
                    #print(orgtimestamp[i][8:10])
                    #print(gentimestamp[y][8:10])
                    if (orgtimestamp[i][11:13]) == (gentimestamp[y][11:13]):
                        #print("Match hour")
                        #print(orgtimestamp[i][11:13])
                        #print(gentimestamp[y][11:13])
                        if (orgtimestamp[i][14:16]) == (gentimestamp[y][14:16]):
                            print("Match minute")
                            print("Date & time match")
                            print(orgtimestamp[i])
                            print(gentimestamp[y])
                            print(i)
                            print(y)
                            print("")
                            # Flag the generated minute (index y), since the flag
                            # array is as long as gentimestamp; the post used index i.
                            CompareTimeStampArray[y] = "X"
                            break
print("Finished")
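The nested loops above compare every sample with every generated minute, which is what makes the script slow. A minimal sketch of one alternative, assuming the trimmed samples look like `YYYY-MM-DD HH:MM` as the slicing in the code suggests, is to put the parsed samples in a set and test each generated minute for membership. The helper name `flag_minutes` is hypothetical. The first and last timestamps are taken from the data rather than hard coded.

```python
from datetime import datetime, timedelta

def flag_minutes(stamps, fmt="%Y-%m-%d %H:%M"):
    """Return (minute, flag) pairs from the first to the last sample.

    The flag is "X" where a sample exists and "Y" where it does not,
    mirroring the markers used in the script above.
    """
    # A set gives constant-time membership tests and removes duplicates.
    seen = {datetime.strptime(s, fmt) for s in stamps}
    if not seen:
        return []
    current, last = min(seen), max(seen)
    flags = []
    while current <= last:
        flags.append((current, "X" if current in seen else "Y"))
        current += timedelta(minutes=1)
    return flags

flags = flag_minutes(["2017-04-13 08:30", "2017-04-13 08:31", "2017-04-13 08:34"])
for minute, flag in flags:
    print(minute.strftime("%Y-%m-%d %H:%M"), flag)
```

Comparing `datetime` objects instead of string slices also removes the need for the five nested field checks.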
【Discussion】:
- Before the edit, the missing "Time session x" column made things harder. Also, are there really duplicate entries in the log?