【Question title】: Pandas get max of sequence
【Posted】: 2021-02-24 08:38:39
【Question】:

I have a dataframe that looks like this:

index                 status    counting
2018-02-11 10:00:00   close     0
2018-02-11 10:00:01   close     0
2018-02-11 10:00:02   close     0
2018-02-11 10:00:03   open      1
2018-02-11 10:00:04   open      2
2018-02-11 10:00:05   open      3
2018-02-11 10:00:06   close     0
2018-02-11 10:00:07   close     0
2018-02-11 10:00:08   close     0
2018-02-11 10:00:09   open      1
2018-02-11 10:00:10   open      2
2018-02-11 10:00:11   open      3
2018-02-11 10:00:12   open      4
2018-02-11 10:00:13   open      5
2018-02-11 10:00:14   open      6
2018-02-11 10:00:15   close     0
2018-02-11 10:00:16   close     0
2018-02-11 10:00:17   close     0

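I want to compute the average duration of the consecutive "open" intervals, either by

  • getting the index values at the start and end of each interval, or
  • taking the highest counting value in each open block.

Does anyone have an idea?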

【Question comments】:

    Tags: pandas dataframe time timestamp


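    【Solution 1】:

    Compare the status column with open to build a boolean mask, invert the mask with ~ and take its cumulative sum to get one label per block, keep only the labels of the open rows, and pass them to groupby: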

    m = df['status'].eq('open')
    
    s = df.groupby((~m).cumsum()[m])['counting'].max()
    print (s)
    status
    3.0    3
    6.0    6
    Name: counting, dtype: int64
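
The question asks for the average duration, which the per-block maxima above do not yet give. A minimal sketch of that last step, assuming one row per second so that the highest counting value of a block equals its length in seconds. The sample frame is rebuilt from the question, and the names block_id, longest and average_duration are only illustrative:

```python
import pandas as pd

# Rebuild the sample data from the question (one row per second).
start = pd.Timestamp("2018-02-11 10:00:00")
idx = start + pd.to_timedelta(list(range(18)), unit="s")
status = ["close"] * 3 + ["open"] * 3 + ["close"] * 3 + ["open"] * 6 + ["close"] * 3
counting = [0, 0, 0, 1, 2, 3, 0, 0, 0, 1, 2, 3, 4, 5, 6, 0, 0, 0]
df = pd.DataFrame({"status": status, "counting": counting}, index=idx)

m = df["status"].eq("open")

# Label of each open row: number of close rows seen so far.
block_id = (~m).cumsum()[m]

# Highest counting value per open block, then the average over blocks.
longest = df.loc[m, "counting"].groupby(block_id).max()
average_duration = longest.mean()
print(average_duration)
```

Grouping only the open rows (df.loc[m, ...]) keeps the group labels as integers instead of the floats shown in the output above, which appear because the close rows have no label.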
    

    If you need the mean of counting within each block:

    s = df.groupby((~m).cumsum()[m])['counting'].mean()
    print (s)
    status
    3.0    2.0
    6.0    3.5
    Name: counting, dtype: float64
    

    If you need the first and last timestamp of each block (min and max of the index):

    df = df.reset_index()
    m = df['status'].eq('open')
    
    df1 = df.groupby((~m).cumsum()[m])['index'].agg(['min','max'])
    print (df1)
                            min                  max
    status                                          
    3.0     2018-02-11 10:00:03  2018-02-11 10:00:05
    6.0     2018-02-11 10:00:09  2018-02-11 10:00:14
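
The start and end timestamps can be turned into durations by subtracting them. This is a sketch under the assumption that the index holds real datetime values (not strings) and that rows are one second apart. The names bounds, durations and mean_duration are illustrative:

```python
import pandas as pd

# Rebuild the sample data from the question with a datetime index.
start = pd.Timestamp("2018-02-11 10:00:00")
idx = start + pd.to_timedelta(list(range(18)), unit="s")
status = ["close"] * 3 + ["open"] * 3 + ["close"] * 3 + ["open"] * 6 + ["close"] * 3
df = pd.DataFrame({"status": status}, index=idx)
df.index.name = "index"

df = df.reset_index()
m = df["status"].eq("open")
block_id = (~m).cumsum()[m]

# First and last timestamp of each open block.
bounds = df[m].groupby(block_id)["index"].agg(["min", "max"])

# Elapsed time between first and last open row of each block.
durations = bounds["max"] - bounds["min"]
mean_duration = durations.mean()
print(durations)
print(mean_duration)
```

Note that max minus min measures the time between the first and last open row, so a block of three rows yields 2 seconds. If a block should count as lasting 3 seconds, as the counting column suggests, add one sampling interval, for example durations + pd.Timedelta(seconds=1).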
    

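    Details: the cumulative sum of the inverted mask increases on every close row and stays constant across a run of open rows, so all rows of one open block share a label. Filtering with the mask then keeps only those labels.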

    print ((~m).cumsum())
    0     1
    1     2
    2     3
    3     3
    4     3
    5     3
    6     4
    7     5
    8     6
    9     6
    10    6
    11    6
    12    6
    13    6
    14    6
    15    7
    16    8
    17    9
    Name: status, dtype: int32
    
    print ((~m).cumsum()[m])
    3     3
    4     3
    5     3
    9     6
    10    6
    11    6
    12    6
    13    6
    14    6
    Name: status, dtype: int32
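
For comparison, a different and commonly used way to label runs is to start a new label whenever the value changes from the previous row. This is not the method used in the answer above, only an alternative sketch. The names run_id and run_lengths are illustrative:

```python
import pandas as pd

# Status column from the question.
status = pd.Series(
    ["close"] * 3 + ["open"] * 3 + ["close"] * 3 + ["open"] * 6 + ["close"] * 3,
    name="status",
)
m = status.eq("open")

# A new run starts wherever the mask differs from the previous row.
run_id = m.ne(m.shift()).cumsum()

# Number of rows in each open run.
run_lengths = m[m].groupby(run_id[m]).size()
print(run_lengths)
```

Here every run, open or close, gets its own consecutive label, so the open blocks end up with labels 2 and 4 rather than 3 and 6. The block lengths are the same either way.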
    
