【Question title】: Last 10 minutes per day
【Posted】: 2016-09-28 14:14:14
【Question】:

I am trying to get the traded volume for the last 10 minutes of each day.

My data looks like this:

DF_Q

Out[97]: 
LongTime
2016-01-04 09:30:00     35077034
2016-01-04 09:30:11         1119
2016-01-04 09:30:21     12295250
2016-01-04 09:30:23      1387856
2016-01-04 09:30:40       877954
...
2016-05-27 15:59:53        16986
2016-05-27 15:59:58     50080165
2016-05-27 15:59:59     17097260
Name: Volume, dtype: int64

I first resample the series to 10-minute intervals, which gives:

DF_Qmin = DF_Q.resample('10min').sum()

DF_Qmin
Out[102]: 
LongTime
2016-01-04 09:30:00    3.202500e+05
2016-01-04 09:40:00    1.192028e+08
2016-01-04 09:50:00    6.156090e+07
2016-01-04 10:00:00    1.289250e+09
...
2016-05-27 15:20:00    1.035539e+09
2016-05-27 15:30:00    1.489631e+09
2016-05-27 15:40:00    2.228257e+09
2016-05-27 15:50:00    5.352179e+09
Freq: 10T, Name: Volume, dtype: float64

Then I build a pivot table, save it to Excel, and manually pick out the volume for the last 10 minutes of each day:

2016-01-04 16:50:00 3.693279e+09
2016-01-05 16:50:00 2.158429e+09
...
2016-05-26 15:50:00 1.256878e+08
2016-05-27 15:50:00 6.521489e+09

Can this be done without Excel? Or do I have to iterate over each day?

【Comments】:

    Tags: python pandas group-by resampling days


    【Solution 1】:

    I think you need to group by date and aggregate with last. Then use rename_axis (new in pandas 0.18.0) and reset_index.

    # if you need LongTime as a column
    DF_Qmin = DF_Qmin.reset_index()
    
    print (DF_Qmin.groupby(DF_Qmin.LongTime.dt.date).last())
    

    Sample:

    import pandas as pd
    
    DF_Qmin = pd.Series({pd.Timestamp('2016-01-04 09:30:00'): 320250.0, pd.Timestamp('2016-01-04 09:50:00'): 61560900.0, pd.Timestamp('2016-05-27 15:40:00'): 2228257000.0, pd.Timestamp('2016-01-04 09:40:00'): 119202800.0, pd.Timestamp('2016-05-27 15:30:00'): 1489631000.0, pd.Timestamp('2016-01-04 10:00:00'): 1289250000.0, pd.Timestamp('2016-05-27 15:50:00'): 5352179000.0, pd.Timestamp('2016-05-27 15:20:00'): 1035539000.0}, name='Volume')
    DF_Qmin.index.name = 'LongTime'
    print (DF_Qmin)
    LongTime
    2016-01-04 09:30:00    3.202500e+05
    2016-01-04 09:40:00    1.192028e+08
    2016-01-04 09:50:00    6.156090e+07
    2016-01-04 10:00:00    1.289250e+09
    2016-05-27 15:20:00    1.035539e+09
    2016-05-27 15:30:00    1.489631e+09
    2016-05-27 15:40:00    2.228257e+09
    2016-05-27 15:50:00    5.352179e+09
    Name: Volume, dtype: float64
    
    DF_Qmin = DF_Qmin.reset_index()
    print (DF_Qmin)
                 LongTime        Volume
    0 2016-01-04 09:30:00  3.202500e+05
    1 2016-01-04 09:40:00  1.192028e+08
    2 2016-01-04 09:50:00  6.156090e+07
    3 2016-01-04 10:00:00  1.289250e+09
    4 2016-05-27 15:20:00  1.035539e+09
    5 2016-05-27 15:30:00  1.489631e+09
    6 2016-05-27 15:40:00  2.228257e+09
    7 2016-05-27 15:50:00  5.352179e+09
    
    print (DF_Qmin.groupby(DF_Qmin.LongTime.dt.date)
                  .last()
                  .rename_axis('Date')
                  .reset_index())
    
             Date            LongTime        Volume
    0  2016-01-04 2016-01-04 10:00:00  1.289250e+09
    1  2016-05-27 2016-05-27 15:50:00  5.352179e+09
    

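    If you don't need the LongTime column, apply the same grouping to the original Series (before reset_index), using the dates of its DatetimeIndex: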

    print (DF_Qmin.groupby(DF_Qmin.index.date)
                  .last()
                  .rename_axis('Date')
                  .reset_index())
             Date        Volume
    0  2016-01-04  1.289250e+09
    1  2016-05-27  5.352179e+09
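One thing to watch when applying this to the full resampled series: resample creates a bin for every 10 minutes between the first and last timestamp, including nights and weekends. In recent pandas versions the sum of an empty bin is 0 rather than NaN, so last() may pick up an empty late-evening bin instead of the closing one. The following is a minimal sketch of one way around that, assuming made-up tick values and the hypothetical names ticks, binned and last_bins; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical tick data covering two trading days (values are made up).
ticks = pd.Series(
    [100, 200, 300, 400, 500, 600],
    index=pd.to_datetime([
        "2016-01-04 09:30:00",
        "2016-01-04 15:49:59",
        "2016-01-04 15:50:00",
        "2016-01-04 15:59:59",
        "2016-01-05 15:45:00",
        "2016-01-05 15:55:00",
    ]),
    name="Volume",
)
ticks.index.name = "LongTime"

# min_count=1 leaves bins without any trades as NaN instead of 0,
# so dropna() can discard the overnight (and other empty) bins.
binned = ticks.resample("10min").sum(min_count=1).dropna()

# Group by calendar date and keep the last populated bin of each day.
last_bins = binned.groupby(binned.index.date).last().rename_axis("Date")
print(last_bins)
```

Dropping the empty bins before grouping also keeps the timestamp and the volume consistent if you later reset the index and take last() over both columns.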
    

    【Comments】:

      【Solution 2】:

      After resampling your Series/DataFrame, you can do:

      # .ix was removed in pandas 1.0; .loc accepts the same boolean mask
      DF_Qmin.loc[DF_Qmin.index.minute == 50]
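Note that a mask on minute == 50 matches the :50 bin of every hour (09:50, 10:50, ...), not only the closing one, and it assumes the session always ends on the hour. A possible alternative, sketched below under the assumption that the series is sorted by time and contains only populated bins, is to take the last row of each calendar day with groupby(...).tail(1), which also keeps the original timestamps. The data and the names binned, by_minute and last_per_day are hypothetical.

```python
import pandas as pd

# Hypothetical 10-minute bins; the two days close at different times.
binned = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.to_datetime([
        "2016-01-04 09:50:00",
        "2016-01-04 15:40:00",
        "2016-01-04 15:50:00",
        "2016-01-05 09:50:00",
        "2016-01-05 16:50:00",
    ]),
    name="Volume",
)

# The minute mask also returns the 09:50 bins.
by_minute = binned.loc[binned.index.minute == 50]
print(by_minute)

# normalize() floors each timestamp to midnight, giving one group per day;
# tail(1) keeps the last row of each group with its original index.
last_per_day = binned.groupby(binned.index.normalize()).tail(1)
print(last_per_day)
```

Because tail(1) relies on row order, sort the index first (for example with sort_index()) if the data may be out of order.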
      

      【Comments】:
