【Question title】: Python - Group by dates
【Posted】: 2021-06-05 10:39:27
【Question description】:

I'd like to speed up this task. It works, but it is slow.

    # Split the CSV file into two groups.
    for index, row in tqdm(df.iterrows(), total=df.shape[0]):
        date_time_obj = datetime.datetime.strptime(row["date"], '%Y-%m-%d')
        if date_time_obj <= datetime.datetime.strptime("2020-03-11", '%Y-%m-%d'):
            group = "before"
        else:
            group = "after"
        df.loc[index, "group"] = group
        df.loc[index, "month"] = date_time_obj.month
    
    ans=[y for x, y in df.groupby('group', as_index=False)]

【Discussion】:

    Tags: python pandas csv date time


    【Solution 1】:

    Much faster, thanks. In the end I used:

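    # Parse the dates once; the comparison is vectorized, so no tqdm progress bar is needed.
    dates = pd.to_datetime(df['date'])
    # Note: >= places 2020-03-11 itself in "After" (the loop in the question put it in "before").
    is_after = dates >= pd.to_datetime('2020-03-11')
    # Map the boolean mask to labels in one step instead of overwriting a boolean column with strings.
    df['group'] = is_after.map({True: "After", False: "Before"})
    df['month'] = dates.dt.month
    
    ans = [y for x, y in df.groupby('group', as_index=False)]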
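
The vectorized comparison and the original loop can disagree on the cutoff date itself, depending on which operator is used. The following is a minimal sketch of a sanity check on a tiny sample; the sample frame and the variable names (`loop_groups`, `vec_ge`, `vec_gt`) are made up for illustration and are not part of the original post.

```python
import datetime

import pandas as pd

# Hypothetical sample standing in for the CSV: one date on each side of the cutoff, plus the cutoff.
df = pd.DataFrame({"date": ["2020-03-10", "2020-03-11", "2020-03-12"]})
cutoff = "2020-03-11"
cutoff_dt = datetime.datetime.strptime(cutoff, "%Y-%m-%d")

# Row-by-row reference that follows the loop in the question (cutoff date counts as "before").
loop_groups = [
    "before" if datetime.datetime.strptime(d, "%Y-%m-%d") <= cutoff_dt else "after"
    for d in df["date"]
]

dates = pd.to_datetime(df["date"])
labels = {True: "after", False: "before"}

# "after" defined with >= : the cutoff date lands in "after".
vec_ge = (dates >= pd.Timestamp(cutoff)).map(labels).tolist()
# "after" defined with > : the cutoff date stays in "before", like the loop.
vec_gt = (dates > pd.Timestamp(cutoff)).map(labels).tolist()

print(vec_ge == loop_groups)  # False
print(vec_gt == loop_groups)  # True
```

Which boundary is right depends on what the data needs; the check only makes the difference visible.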
    

    【Discussion】:

      【Solution 2】:

      To speed this up, you can use a vectorized form (without iterrows):

      df = pd.DataFrame({'date': pd.date_range('2020-03-08', '2020-03-14')})
      df['group'] = pd.to_datetime(df['date']) <= pd.to_datetime('2020-03-11')
      df['month'] = df['date'].dt.month
      
      df
      

      Output:

              date  group  month
      0 2020-03-08   True      3
      1 2020-03-09   True      3
      2 2020-03-10   True      3
      3 2020-03-11   True      3
      4 2020-03-12  False      3
      5 2020-03-13  False      3
      6 2020-03-14  False      3
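
The boolean column above can be turned into the two sub-frames the question asks for. This is a rough sketch under some assumptions: the label names "before" and "after" are taken from the question, and collecting the groups in a dict (`groups`) is an illustrative choice rather than something the answer prescribes.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2020-03-08", "2020-03-14")})
df["group"] = df["date"] <= pd.Timestamp("2020-03-11")
df["month"] = df["date"].dt.month

# Replace the boolean flag with readable labels.
df["group"] = df["group"].map({True: "before", False: "after"})

# groupby yields (key, sub-frame) pairs in sorted key order, so a plain list
# would put "after" first; a dict keeps each sub-frame addressable by name.
groups = {name: frame for name, frame in df.groupby("group")}

print(len(groups["before"]), len(groups["after"]))  # 4 3
```

Looking groups up by name avoids depending on the position of each sub-frame in a list such as `ans`.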
      

      【Discussion】:
