[Posted]: 2020-05-27 16:51:19
[Question]:
I have a df like the one below:
import datetime as dt
import pandas as pd
import pytz
cols = ['utc_datetimes', 'zone_name']
data = [
    ['2019-11-13 14:41:26,2019-12-18 23:04:12', 'Europe/Stockholm'],
    ['2019-12-06 21:49:04,2019-12-11 22:52:57,2019-12-18 20:30:58,2019-12-23 18:49:53,2019-12-27 18:34:23,2020-01-07 21:20:51,2020-01-11 17:36:56,2020-01-20 21:45:47,2020-01-30 20:48:49,2020-02-03 21:04:52,2020-02-07 20:05:02,2020-02-10 21:07:21', 'Europe/London']
]
df = pd.DataFrame(data, columns=cols)
print(df)
# utc_datetimes zone_name
# 0 2019-11-13 14:41:26,2019-12-18 23:04:12 Europe/Stockholm
# 1 2019-12-06 21:49:04,2019-12-11 22:52:57,2019-1... Europe/London
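For each row, I want to count how many of the datetimes fall at night and how many fall on a Wednesday, both judged in the row's local time zone. This is the desired output: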
   utc_datetimes                                      zone_name         nights  wednesdays
0  2019-11-13 14:41:26,2019-12-18 23:04:12            Europe/Stockholm  0       1
1  2019-12-06 21:49:04,2019-12-11 22:52:57,2019-1...  Europe/London     11      2
I came up with the double for loop below, but it is not as efficient as I would like for a fairly large df:
# New columns.
df['nights'] = 0
df['wednesdays'] = 0
for row in range(df.shape[0]):
    date_list = df['utc_datetimes'].iloc[row].split(',')
    user_time_zone = df['zone_name'].iloc[row]
    for date in date_list:
        # Parse the string as UTC, then convert to the row's local zone.
        datetime_obj = dt.datetime.strptime(
            date, '%Y-%m-%d %H:%M:%S'
        ).replace(tzinfo=pytz.utc)
        local_datetime = datetime_obj.astimezone(pytz.timezone(user_time_zone))
        # Get day of the week count (Monday is 0, so Wednesday is 2):
        if local_datetime.weekday() == 2:
            # .loc avoids chained assignment, which may not update df.
            df.loc[df.index[row], 'wednesdays'] += 1
        # Get time of the day count:
        if 17 < local_datetime.hour <= 23:
            df.loc[df.index[row], 'nights'] += 1
Any suggestions would be much appreciated :)
P.S. Ignore the definition of "night"; it is just an example.
[Comments]:
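One possible way to avoid the Python-level double loop, offered as a sketch rather than a benchmarked solution: split and explode the comma-separated strings into one timestamp per row, parse them all as UTC in a single call, convert once per distinct time zone, and sum boolean flags back to the original rows. The function name `count_nights_and_wednesdays` and the helper column `utc` are hypothetical names chosen for illustration, and the sketch assumes df does not yet have `nights` or `wednesdays` columns.

```python
import pandas as pd


def count_nights_and_wednesdays(df):
    """Hypothetical helper: per-row counts of local-time nights and Wednesdays."""
    # One timestamp per row; explode() repeats the original index label.
    exploded = df.assign(utc=df['utc_datetimes'].str.split(',')).explode('utc')
    exploded['utc'] = pd.to_datetime(exploded['utc'], utc=True)

    parts = []
    # A tz-aware column holds a single zone, so convert one zone at a time.
    for zone, group in exploded.groupby('zone_name'):
        local = group['utc'].dt.tz_convert(zone)
        parts.append(pd.DataFrame({
            # Same "night" definition as the loop: 17 < hour <= 23.
            'nights': local.dt.hour.between(18, 23),
            # Monday is 0, so Wednesday is 2.
            'wednesdays': local.dt.weekday == 2,
        }))

    # Summing the boolean flags per original index label gives the counts.
    counts = pd.concat(parts).groupby(level=0).sum().astype(int)
    return df.join(counts)
```

Calling `count_nights_and_wednesdays(df)` on the df built at the top of the question should reproduce the `nights` and `wednesdays` columns of the desired output. The remaining Python loop runs once per distinct time zone rather than once per timestamp.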
-
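My calculation may be off, so please explain this to me: for the second row, the Wednesday count is correct (2). For the first row, however, I also get 2 rather than 1.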
-
You need to convert to local time first, so the second date in the first row becomes 'Dec 19th 2019 00:04:12', which is a Thursday.
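A quick way to check this for the timestamp in question, as a minimal sketch that uses pandas' `Timestamp.tz_convert` instead of the pytz calls in the question (the variable names are illustrative only):

```python
import pandas as pd

# Second timestamp of the first row, interpreted as UTC.
utc_ts = pd.Timestamp('2019-12-18 23:04:12', tz='UTC')

# The same instant on the Stockholm wall clock (UTC+1 in winter).
local_ts = utc_ts.tz_convert('Europe/Stockholm')

print(utc_ts.day_name())    # weekday when read as UTC
print(local_ts)             # local date and time
print(local_ts.day_name())  # weekday in local time
```

Read as UTC the timestamp is a Wednesday, but after conversion it falls just past midnight on Thursday, which is why the first row has only one Wednesday.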
Tags: python pandas python-3.7 python-datetime pytz