Rounding time off to the nearest second - Python answers

【Question title】: Rounding time off to the nearest second - Python
【Posted】: 2018-05-27 07:07:22
【Question description】:

I have a large dataset with more than 500 000 date and time stamps that look like this:

date        time
2017-06-25 00:31:53.993
2017-06-25 00:32:31.224
2017-06-25 00:33:11.223
2017-06-25 00:33:53.876
2017-06-25 00:34:31.219
2017-06-25 00:35:12.634

How can I round these timestamps off to the nearest second?

My code looks like this:

readcsv = pd.read_csv(filename)
log_date = readcsv.date
log_time = readcsv.time

readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time
timestamp = [datetime.datetime.combine(log_date[i],log_time[i]) for i in range(len(log_date))]

So now I have combined the dates and times into a list of datetime.datetime objects that look like this:

datetime.datetime(2017,6,25,00,31,53,993000)
datetime.datetime(2017,6,25,00,32,31,224000)
datetime.datetime(2017,6,25,00,33,11,223000)
datetime.datetime(2017,6,25,00,33,53,876000)
datetime.datetime(2017,6,25,00,34,31,219000)
datetime.datetime(2017,6,25,00,35,12,634000)

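Where do I go from here? The df.timestamp.dt.round('1s') function doesn't seem to work. Also, when using .split(), I ran into problems whenever the seconds or minutes exceeded 59.

Many thanks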

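【Comments on the question】:

Please post the desired output for the timestamps.
stackoverflow.com/questions/3463930/… Use 1*60 as the argument.
Are you using pandas?
What does your dataset consist of? A pandas DataFrame, a CSV file? Have you tried anything to solve the problem?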
arrow.readthedocs.io/en/latest

Tags: python python-2.7 pandas datetime

【Solution 1】:

Here is a simple solution that rounds correctly both up and down and doesn't use any string hacks:

from datetime import datetime, timedelta

def round_to_secs(dt: datetime) -> datetime:
    extra_sec = round(dt.microsecond / 10 ** 6)
    return dt.replace(microsecond=0) + timedelta(seconds=extra_sec)

Some examples:

now = datetime.now()
print(now)                 # 2021-07-26 10:43:54.397538
print(round_to_secs(now))  # 2021-07-26 10:43:54 -- rounded down

now = datetime.now()
print(now)                 # 2021-07-26 10:44:59.787438
print(round_to_secs(now))  # 2021-07-26 10:45:00  -- rounded up taking into account secs and minutes
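
Because datetime.now() changes on every run, a reproducible check is easier with fixed values. The sketch below repeats round_to_secs from the answer and applies it to two timestamps from the question plus one made-up end-of-day value (an assumption added here to show the carry into the next day).

```python
from datetime import datetime, timedelta

def round_to_secs(dt: datetime) -> datetime:
    # Same logic as the answer: the fractional part contributes 0 or 1 second.
    extra_sec = round(dt.microsecond / 10 ** 6)
    return dt.replace(microsecond=0) + timedelta(seconds=extra_sec)

samples = [
    datetime(2017, 6, 25, 0, 31, 53, 993000),   # from the question
    datetime(2017, 6, 25, 0, 32, 31, 224000),   # from the question
    datetime(2017, 6, 25, 23, 59, 59, 600000),  # hypothetical edge case
]
rounded = [round_to_secs(ts) for ts in samples]

for before, after in zip(samples, rounded):
    print(before, "->", after)
```

Adding a timedelta lets the datetime arithmetic handle the carry into minutes, hours, and the date, so no manual overflow handling is needed.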

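【Comments】:

【Solution 2】:

Another way to do this, which:

doesn't involve string manipulation
uses Python's built-in round
doesn't change the original timedelta, but provides a new one
is a one-liner :)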

import datetime

original = datetime.timedelta(seconds=50, milliseconds=20)
rounded = datetime.timedelta(seconds=round(original.total_seconds()))
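
One detail worth knowing when relying on the built-in round in Python 3: exact halves are rounded to the nearest even integer, not always up. The sketch below wraps the one-liner in a helper (round_timedelta is a hypothetical name, not part of the answer) and shows the effect on two tie values.

```python
import datetime

def round_timedelta(td: datetime.timedelta) -> datetime.timedelta:
    # Hypothetical helper wrapping the one-liner from the answer.
    return datetime.timedelta(seconds=round(td.total_seconds()))

a = round_timedelta(datetime.timedelta(seconds=50, milliseconds=20))
b = round_timedelta(datetime.timedelta(seconds=50, milliseconds=500))
c = round_timedelta(datetime.timedelta(seconds=51, milliseconds=500))

print(a)  # 50.02 s rounds down to 50 s
print(b)  # 50.5 s is a tie and goes to the even value, 50 s
print(c)  # 51.5 s is a tie and goes to the even value, 52 s
```

If ties should always round up, comparing the microseconds against 500 000 is one alternative.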

【Comments】:

【Solution 3】:

Without any extra packages, a datetime object can be rounded to the nearest second with the following simple function:

import datetime as dt

def round_seconds(obj: dt.datetime) -> dt.datetime:
    if obj.microsecond >= 500_000:
        obj += dt.timedelta(seconds=1)
    return obj.replace(microsecond=0)

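【Comments】:

A small improvement here: to follow Python's naming conventions, rename roundSeconds to round_seconds, and likewise all the other camelCase names. Other than that, great answer!
Because of floating-point precision, this function leaves a few microseconds behind. @Maciejo95's answer doesn't have this problem.
Careful: this function mutates the given object.
@electrovir I was wrong. The function does not actually mutate the object. The way the function was written threw me off a bit, because new_date_time = date_time_object doesn't create a copy. You can perform all the operations on the given object and it will still work. That's because datetime objects are immutable.
@ErikKalkoken Got it, that makes sense. Feel free to edit the answer to make it read better! Python (along with all its conventions) isn't one of my strongest languages.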
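
The immutability point from the comments can be checked directly. A minimal sketch, repeating round_seconds from the answer and using a made-up year-end timestamp: the += inside the function only rebinds the local name, so the caller's object is untouched.

```python
import datetime as dt

def round_seconds(obj: dt.datetime) -> dt.datetime:
    if obj.microsecond >= 500_000:
        # Creates a new datetime and rebinds the local name only.
        obj += dt.timedelta(seconds=1)
    return obj.replace(microsecond=0)

original = dt.datetime(2017, 12, 31, 23, 59, 59, 500_000)  # hypothetical value
result = round_seconds(original)

print(original)  # still carries its microseconds
print(result)    # rolled over into the next year
```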

【Solution 4】:

I needed this, so I adapted @srisaila's answer to handle 60 seconds/minutes. The style is very convoluted, but it does the basic job.

def round_seconds(dts):
    result = []
    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]
        if len(s) == 1:
            s = '0'+s
        if int(s) == 60:
            # Seconds overflowed: carry one minute.
            m_tmp = int(m)
            m_tmp += 1
            m = str(m_tmp)
            if(len(m)) == 1:
                m = '0'+ m
            s = '00'
        if int(m) == 60:
            # m is a string, so compare as int; carry one hour.
            h_tmp = int(h)
            h_tmp += 1
            h = str(h_tmp)
            if(len(h)) == 1:
                h = '0'+ h
            m = '00'
        # Note: an hour of 24 is not carried over into the date.
        result.append(date + ' ' + h + ':' + m + ':' + s)
    return result

【Comments】:

【Solution 5】:

An elegant solution that needs only the standard datetime module.

import datetime

# Subtracting the microsecond part truncates (rounds down) to the whole second.
currentimemili = datetime.datetime.now()
currenttimesecs = currentimemili - \
    datetime.timedelta(microseconds=currentimemili.microsecond)
print(currenttimesecs)

【Comments】:

【Solution 6】:

If anyone wants to round a single datetime item to the nearest second, this does the job:

pandas.to_datetime(your_datetime_item).round('1s')

【Comments】:

Also works on pandas column data. I think this is the best answer.
This solution changes the output type (datetime.datetime -> pd.Timestamp), which is not always wanted.
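
If the change of type is a problem, the rounded pd.Timestamp can be converted back with its to_pydatetime() method. A minimal sketch, assuming pandas is installed and using one value from the question:

```python
import datetime
import pandas as pd

item = datetime.datetime(2017, 6, 25, 0, 31, 53, 993000)

rounded_ts = pd.to_datetime(item).round('1s')  # pandas.Timestamp
rounded_dt = rounded_ts.to_pydatetime()        # plain datetime.datetime again

print(type(rounded_ts).__name__, rounded_ts)
print(type(rounded_dt).__name__, rounded_dt)
```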

【Solution 7】:

An alternative version of @electrovir's solution:

import datetime

def roundSeconds(dateTimeObject):
    newDateTime = dateTimeObject + datetime.timedelta(seconds=.5)
    return newDateTime.replace(microsecond=0)

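【Comments】:

【Solution 8】:

The question doesn't say how you want to round. Rounding down is usually what's appropriate for time functions. This isn't statistics.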

rounded_down_datetime = raw_datetime.replace(microsecond=0)

【Comments】:

【Solution 9】:

If you're using pandas, you can round the data to the nearest second with dt.round:

df

                timestamp
0 2017-06-25 00:31:53.993
1 2017-06-25 00:32:31.224
2 2017-06-25 00:33:11.223
3 2017-06-25 00:33:53.876
4 2017-06-25 00:34:31.219
5 2017-06-25 00:35:12.634

df.timestamp.dt.round('1s')

0   2017-06-25 00:31:54
1   2017-06-25 00:32:31
2   2017-06-25 00:33:11
3   2017-06-25 00:33:54
4   2017-06-25 00:34:31
5   2017-06-25 00:35:13
Name: timestamp, dtype: datetime64[ns]

If timestamp is not a datetime column, convert it first with pd.to_datetime:

df.timestamp = pd.to_datetime(df.timestamp)

After that, dt.round should work.
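
The question stores the date and the time in separate columns. One way to get a single datetime column is to join the two strings before converting. A minimal sketch, assuming pandas is installed and that both columns are read as strings; the column names follow the question, and the last row is a made-up edge case.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2017-06-25", "2017-06-25", "2017-06-25"],
    "time": ["00:31:53.993", "00:32:31.224", "23:59:59.876"],  # last row is hypothetical
})

# Join the two string columns, parse once, then round the whole column.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])
df["rounded"] = df["timestamp"].dt.round("1s")

print(df["rounded"])
```

This avoids the Python-level loop with datetime.datetime.combine and works on the whole column at once.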

【Comments】:

Thanks a lot, I can't believe I didn't think of this.

【Solution 10】:

If your dataset is stored in a file, you can do this:

with open('../dataset.txt') as fp:
    # The first line is skipped, as in the original loop (e.g. a header row).
    next(fp, None)
    for line in fp:
        line = line.strip()
        if not line:
            continue  # ignore blank lines instead of failing on float('')
        print("\n" + line)
        sec = line[line.rfind(':') + 1:]
        rounded_num = int(round(float(sec)))
        print(line[:line.rfind(':') + 1] + str(rounded_num))
        print(abs(float(sec) - rounded_num))

If your dataset is stored in a list:

dts = ['2017-06-25 00:31:53.993',
   '2017-06-25 00:32:31.224',
   '2017-06-25 00:33:11.223',
   '2017-06-25 00:33:53.876',
   '2017-06-25 00:34:31.219',
   '2017-06-25 00:35:12.634']

for line in dts:
    print("\n" + line.strip())
    sec = line[line.rfind(':') + 1:]
    rounded_num = int(round(float(sec)))
    print(line[:line.rfind(':') + 1] + str(rounded_num))
    print(abs(float(sec) - rounded_num))

【Comments】:

【Solution 11】:

Using a for loop and str.split():

dts = ['2017-06-25 00:31:53.993',
       '2017-06-25 00:32:31.224',
       '2017-06-25 00:33:11.223',
       '2017-06-25 00:33:53.876',
       '2017-06-25 00:34:31.219',
       '2017-06-25 00:35:12.634']

for item in dts:
    date = item.split()[0]
    h, m, s = [item.split()[1].split(':')[0],
               item.split()[1].split(':')[1],
               str(round(float(item.split()[1].split(':')[-1])))]

    print(date + ' ' + h + ':' + m + ':' + s)

2017-06-25 00:31:54
2017-06-25 00:32:31
2017-06-25 00:33:11
2017-06-25 00:33:54
2017-06-25 00:34:31
2017-06-25 00:35:13
>>>

You can turn it into a function:

def round_seconds(dts):
    result = []
    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]
        result.append(date + ' ' + h + ':' + m + ':' + s)

    return result

Testing the function:

dts = ['2017-06-25 00:31:53.993',
       '2017-06-25 00:32:31.224',
       '2017-06-25 00:33:11.223',
       '2017-06-25 00:33:53.876',
       '2017-06-25 00:34:31.219',
       '2017-06-25 00:35:12.634']

from pprint import pprint

pprint(round_seconds(dts))

['2017-06-25 00:31:54',
 '2017-06-25 00:32:31',
 '2017-06-25 00:33:11',
 '2017-06-25 00:33:54',
 '2017-06-25 00:34:31',
 '2017-06-25 00:35:13']
>>>

Since you seem to be using Python 2.7, to strip any trailing zeros you may need to change:

str(round(float(item.split()[1].split(':')[-1])))

to

str(round(float(item.split()[1].split(':')[-1]))).rstrip('0').rstrip('.')

I just tried the function with Python 2.7 on repl.it and it ran as expected.

【Comments】:

I believe this fails for the edge case '2017-06-25 00:31:59.993'.
This only works when rounding down. Rounding the seconds up doesn't update the minutes.
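
One way around the edge case raised in these comments is to parse each string into a datetime, round there, and format it back, so the carry into minutes, hours, and the date is handled by datetime arithmetic. A minimal sketch in Python 3, not the answerer's method: round_seconds_str and FMT are hypothetical names, and the format string assumes every input has a fractional-seconds part.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumption: inputs always include fractional seconds

def round_seconds_str(dts):
    """Round timestamp strings to the nearest second, carrying overflow."""
    result = []
    for item in dts:
        parsed = datetime.strptime(item, FMT)
        if parsed.microsecond >= 500_000:
            parsed += timedelta(seconds=1)  # carries into minutes/hours/date
        result.append(parsed.replace(microsecond=0).strftime("%Y-%m-%d %H:%M:%S"))
    return result

out = round_seconds_str(['2017-06-25 00:31:59.993',
                         '2017-06-25 23:59:59.500',
                         '2017-06-25 00:33:11.223'])
print(out)
```

The output strings are always zero-padded, which the split-based version does not guarantee.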