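【Question title】: pandas: add new column based on datetime index lookup of same dataframe
【Posted】: 2022-01-22 13:47:14
【Question】:

I have the following data, and I want to add a new column holding the month-over-month percent change for the current month. The date is the index of my dataframe.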

    date    close
1/26/1990   421.2999878
1/29/1990   418.1000061
1/30/1990   410.7000122
1/31/1990   415.7999878
2/23/1990   419.5
2/26/1990   421
2/27/1990   422.6000061
2/28/1990   425.7999878
3/26/1990   438.7999878
3/27/1990   439.5
3/28/1990   436.7000122
3/29/1990   435.3999939
3/30/1990   435.5

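The simplest approach I can think of is to add a column holding the previous month's end date and, for convenience, another holding the 'close' value at that previous month end, from which I can compute the month-over-month change. So in the end the table would have those two extra columns next to 'close'.

I was able to add the previous month end just fine, but I am now having trouble looking up the previous month-end close based on that date. In the code below, the first line correctly adds the previous month's end date as a new column, but the second does not work. The idea is to use the prev_month_end date to look up the month-end close value and add it as a column.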

df['prev_month_end'] = df.index + pd.offsets.BMonthEnd(-1)
df['prev_month_close'] = df[df.index == df['prev_month_end']]

Any help or suggestions would be greatly appreciated.

【Comments】:

    Tags: python pandas numpy


    【Solution 1】:

    You can get prev_month_close as follows:

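    # Move the index (assumed to be named 'date') into a regular column
    df.reset_index(inplace=True)
    # Right side of the self-merge: the same rows, relabelled so that each date
    # acts as a candidate prev_month_end and its close as prev_month_close
    lookup = df[['date', 'close']].rename(columns={'close': 'prev_month_close',
                                                   'date': 'prev_month_end'})
    df = df[['date', 'close', 'prev_month_end']].merge(lookup, how='left', on='prev_month_end')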
    

    OUTPUT

                 date       close prev_month_end  prev_month_close
        0  1990-01-26  421.299988     1989-12-29               NaN
        1  1990-01-29  418.100006     1989-12-29               NaN
        2  1990-01-30  410.700012     1989-12-29               NaN
        3  1990-01-31  415.799988     1989-12-29               NaN
        4  1990-02-23  419.500000     1990-01-31        415.799988
        5  1990-02-26  421.000000     1990-01-31        415.799988
        6  1990-02-27  422.600006     1990-01-31        415.799988
        7  1990-02-28  425.799988     1990-01-31        415.799988
        8  1990-03-26  438.799988     1990-02-28        425.799988
        9  1990-03-27  439.500000     1990-02-28        425.799988
        10 1990-03-28  436.700012     1990-02-28        425.799988
        11 1990-03-29  435.399994     1990-02-28        425.799988
        12 1990-03-30  435.500000     1990-02-28        425.799988
    

    Or, without reset_index:

    df = df[['close', 'prev_month_end']].merge(df[['close']].rename(columns={'close': 'prev_month_close'}),
                                                        how='left', left_on='prev_month_end', right_index=True)
    

    OUTPUT

                     close prev_month_end  prev_month_close
    date                                                   
    1990-01-26  421.299988     1989-12-29               NaN
    1990-01-29  418.100006     1989-12-29               NaN
    1990-01-30  410.700012     1989-12-29               NaN
    1990-01-31  415.799988     1989-12-29               NaN
    1990-02-23  419.500000     1990-01-31        415.799988
    1990-02-26  421.000000     1990-01-31        415.799988
    1990-02-27  422.600006     1990-01-31        415.799988
    1990-02-28  425.799988     1990-01-31        415.799988
    1990-03-26  438.799988     1990-02-28        425.799988
    1990-03-27  439.500000     1990-02-28        425.799988
    1990-03-28  436.700012     1990-02-28        425.799988
    1990-03-29  435.399994     1990-02-28        425.799988
    1990-03-30  435.500000     1990-02-28        425.799988
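For readers who want to run the index-based merge end to end, here is a minimal sketch. It rebuilds the question's sample data and then derives the month-over-month change from the merged column. The names `lookup` and `out` are illustrative assumptions, and the `mom_pct_change` column name is borrowed from Solution 2 rather than from this answer.

```python
import pandas as pd

# Sample data from the question, parsed into a DatetimeIndex named "date".
dates = ["1/26/1990", "1/29/1990", "1/30/1990", "1/31/1990",
         "2/23/1990", "2/26/1990", "2/27/1990", "2/28/1990",
         "3/26/1990", "3/27/1990", "3/28/1990", "3/29/1990", "3/30/1990"]
closes = [421.2999878, 418.1000061, 410.7000122, 415.7999878,
          419.5, 421.0, 422.6000061, 425.7999878,
          438.7999878, 439.5, 436.7000122, 435.3999939, 435.5]
index = pd.to_datetime(dates, format="%m/%d/%Y").rename("date")
df = pd.DataFrame({"close": closes}, index=index)

# First line from the question: previous business month end for every row.
df["prev_month_end"] = df.index + pd.offsets.BMonthEnd(-1)

# Left-join each row's prev_month_end against the frame's own date index.
lookup = df[["close"]].rename(columns={"close": "prev_month_close"})
out = df.merge(lookup, how="left", left_on="prev_month_end", right_index=True)

# Month-over-month change relative to the previous month-end close, in percent.
out["mom_pct_change"] = (out["close"] / out["prev_month_close"] - 1) * 100
print(out)
```

Rows in January stay NaN because 1989-12-29 is not present in the sample, which matches the output shown above.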
                 
    

    【Comments】:

    • So in this case we have to treat them as two different dataframes and then merge them, right?
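On the comment above: a merge is one option, but a plain lookup may also work. The sketch below is an alternative that is not part of the original answer. It assumes the date index is unique and uses `Series.map`, which looks each prev_month_end value up in the index of the 'close' column and returns NaN where no matching date exists.

```python
import pandas as pd

# A few rows from the question's sample data (subset chosen for brevity).
dates = pd.to_datetime(
    ["1/31/1990", "2/23/1990", "2/28/1990", "3/29/1990", "3/30/1990"],
    format="%m/%d/%Y",
).rename("date")
df = pd.DataFrame(
    {"close": [415.7999878, 419.5, 425.7999878, 435.3999939, 435.5]},
    index=dates,
)

df["prev_month_end"] = df.index + pd.offsets.BMonthEnd(-1)

# Look up each prev_month_end in the date index of "close"; no merge needed.
df["prev_month_close"] = df["prev_month_end"].map(df["close"])
print(df)
```

This only finds a value when the previous business month end actually appears as a row, which is the same limitation the merge has.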
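    【Solution 2】:

    We can convert the index to a period index, group the dataframe by period and aggregate close with last to get each month's final close. We then shift the period index back by one month, map it onto those month-end closes, and finally compute the percent change.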

    i = pd.to_datetime(df.index).to_period('M')
    s = i.shift(-1).map(df.groupby(i)['close'].last())
    df['mom_pct_change'] = df['close'].sub(s).div(s).mul(100)
    

                    close  mom_pct_change
    date                                 
    1/26/1990  421.299988             NaN
    1/29/1990  418.100006             NaN
    1/30/1990  410.700012             NaN
    1/31/1990  415.799988             NaN
    2/23/1990  419.500000        0.889854
    2/26/1990  421.000000        1.250604
    2/27/1990  422.600006        1.635406
    2/28/1990  425.799988        2.405002
    3/26/1990  438.799988        3.053077
    3/27/1990  439.500000        3.217476
    3/28/1990  436.700012        2.559893
    3/29/1990  435.399994        2.254581
    3/30/1990  435.500000        2.278068
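The steps described in this answer can be tried on the question's data with the following sketch. The intermediate names `months`, `month_end_close` and `prev_close` are illustrative assumptions that replace the answer's `i` and `s`, and the explicit `format` argument is added only to make the date parsing unambiguous.

```python
import pandas as pd

# Sample data from the question; the index holds date strings, as in the output above.
dates = ["1/26/1990", "1/29/1990", "1/30/1990", "1/31/1990",
         "2/23/1990", "2/26/1990", "2/27/1990", "2/28/1990",
         "3/26/1990", "3/27/1990", "3/28/1990", "3/29/1990", "3/30/1990"]
closes = [421.2999878, 418.1000061, 410.7000122, 415.7999878,
          419.5, 421.0, 422.6000061, 425.7999878,
          438.7999878, 439.5, 436.7000122, 435.3999939, 435.5]
df = pd.DataFrame({"close": closes}, index=pd.Index(dates, name="date"))

# Monthly period for every row, e.g. 1990-02 for 2/23/1990.
months = pd.to_datetime(df.index, format="%m/%d/%Y").to_period("M")

# Last close observed in each month (rows are already in date order).
month_end_close = df.groupby(months)["close"].last()

# Relabel each row with the previous month, then look up that month's last close.
prev_close = months.shift(-1).map(month_end_close).to_numpy()

df["mom_pct_change"] = (df["close"] - prev_close) / prev_close * 100
print(df)
```

Unlike the merge in Solution 1, this does not require the previous business month end itself to be a row in the data. It uses whatever the last available row of the previous month is.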
    

    【Comments】:

    • shift(-1) or shift(1)?
    • It should be -1
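The sign can look surprising, so here is a small sketch of why -1 is the right choice here. Shifting a PeriodIndex changes the labels themselves, so -1 turns each month into the month before it. Shifting a Series moves the values along a fixed index, where +1 would be the equivalent direction. The month-end closes are taken from the sample data, and the variable names are assumptions for illustration.

```python
import pandas as pd

months = pd.period_range("1990-01", "1990-03", freq="M")
month_end_close = pd.Series([415.7999878, 425.7999878, 435.5], index=months)

# Shifting the PeriodIndex by -1 relabels each month as the previous month,
# so mapping those labels fetches the previous month's close.
prev_label = months.shift(-1)
via_labels = pd.Series(prev_label.map(month_end_close).to_numpy(), index=months)

# Shifting the Series values by +1 pushes each close down one row instead.
via_values = month_end_close.shift(1)

print(prev_label)
print(via_labels)
print(via_values)
```

Both routes give NaN for January and the prior month's close for February and March. They differ only in whether the labels or the values are moved.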