Python - Group by dates

【Question Title】: Python - Group by dates
【Posted】: 2021-06-05 10:39:27
【Question Description】:

I'm looking to speed up this task. It works, but it is slow.

import datetime
from tqdm import tqdm

# Split the CSV data into two groups (df is assumed to hold the rows read from the CSV file).
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    date_time_obj = datetime.datetime.strptime(row["date"], '%Y-%m-%d')
    if date_time_obj <= datetime.datetime.strptime("2020-03-11", '%Y-%m-%d'):
        group = "before"
    else:
        group = "after"
    df.loc[index, "group"] = group
    df.loc[index, "month"] = date_time_obj.month

ans = [y for x, y in df.groupby('group', as_index=False)]

【Question Comments】:

Tags: python pandas csv date time

【Solution 1】:

Much faster, thanks. In the end I used:

# Parse the date column once and reuse it.
dates = pd.to_datetime(df['date'])
# True for dates on or after the cutoff. Note that the loop in the question
# put 2020-03-11 itself in "before"; here it lands in "After".
is_after = dates >= pd.to_datetime('2020-03-11')
df['group'] = is_after.map({True: "After", False: "Before"})
df['month'] = dates.dt.month

ans = [y for x, y in df.groupby('group')]
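Since there are only two groups, the split can probably also be done with the boolean mask itself rather than through groupby. A rough sketch with made-up sample data (the names `is_after`, `before` and `after` are illustrative and not part of the original post):

```python
import pandas as pd

# Hypothetical sample data; dates are strings, as they would be after pd.read_csv.
df = pd.DataFrame({"date": ["2020-03-09", "2020-03-11", "2020-03-15"]})

# One vectorized comparison over the whole column.
is_after = pd.to_datetime(df["date"]) >= pd.Timestamp("2020-03-11")

# Select each half directly with the mask and its negation.
after = df[is_after]
before = df[~is_after]
```

This skips the intermediate label column, which may be convenient when the labels are not needed afterwards.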

【Comments】:

【Solution 2】:

To speed this up, you can use a vectorized form (without iterrows):

import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2020-03-08', '2020-03-14')})
# True for dates on or before the cutoff, False afterwards
df['group'] = pd.to_datetime(df['date']) <= pd.to_datetime('2020-03-11')
df['month'] = df['date'].dt.month

df

Output:

        date  group  month
0 2020-03-08   True      3
1 2020-03-09   True      3
2 2020-03-10   True      3
3 2020-03-11   True      3
4 2020-03-12  False      3
5 2020-03-13  False      3
6 2020-03-14  False      3
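To get from the boolean column to the "before"/"after" labels used in the question, and then to one sub-frame per label, something like the following sketch might work. The helper name `split_by_cutoff` and the sample data are assumptions made for illustration; the dates are plain strings, as they would be after reading a CSV file.

```python
import pandas as pd

def split_by_cutoff(df, cutoff="2020-03-11"):
    """Label rows 'before' (on or before the cutoff) or 'after', add a month
    column, and return the labeled frame plus a dict of sub-frames."""
    out = df.copy()
    dates = pd.to_datetime(out["date"], format="%Y-%m-%d")
    on_or_before = dates <= pd.Timestamp(cutoff)
    # Map the boolean mask to the string labels from the question.
    out["group"] = on_or_before.map({True: "before", False: "after"})
    out["month"] = dates.dt.month
    # One sub-frame per label; a label with no rows is simply absent.
    groups = {name: part for name, part in out.groupby("group")}
    return out, groups

# Hypothetical sample data.
sample = pd.DataFrame(
    {"date": ["2020-03-10", "2020-03-11", "2020-03-12", "2020-04-01"]}
)
labeled, groups = split_by_cutoff(sample)
print(labeled)
```

The comparison uses `<=`, so the cutoff date itself falls into "before", matching the loop in the question.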

【Comments】: