【Question title】: How do I group by two columns and then count the occurrences of each unique value in a third column for each of the groupings?
【Posted】: 2021-03-17 09:42:08
【Question】:

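I have a unique identifier column ["EMID"] that I want to group by, together with a date column ["DateNew"]. Then, for each group, I want to count how many times each value in BRalpha occurs.

Dataset: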

EMID DateNew BRalpha
SIM10001 2016-06-01 LUMB
SIM10001 2016-06-01 LUMB
SIM10001 2016-07-01 LUMB
SIM10001 2016-07-01 THOR
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10003 2017-03-01 ANFT
SIM10003 2017-03-01 ANFT

Desired output:

EMID DateNew Count_LUMB Count_THOR Count_NSPC Count_ANFT
SIM10001 2016-06-01 2 0 0 0
SIM10001 2016-07-01 1 1 0 0
SIM10002 2016-02-01 0 0 5 0
SIM10003 2017-03-01 0 0 0 2

【Comments】:

    Tags: python pandas dataframe pandas-groupby


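    【Solution 1】:
    # df is the DataFrame shown in the question.
    print(
        df.groupby(["EMID", "DateNew", "BRalpha"])
        .size()                  # number of rows per (EMID, DateNew, BRalpha) combination
        .unstack(fill_value=0)   # BRalpha values become columns; missing combinations count as 0
        .add_prefix("count_")
        .reset_index()
    )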
    

    Prints:

    BRalpha      EMID     DateNew  count_ANFT  count_LUMB  count_NSPC  count_THOR
    0        SIM10001  2016-06-01           0           2           0           0
    1        SIM10001  2016-07-01           0           1           0           1
    2        SIM10002  2016-02-01           0           0           5           0
    3        SIM10003  2017-03-01           2           0           0           0
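
The output above lists the count columns alphabetically and with a lowercase prefix, whereas the question asks for `Count_` columns in order of first appearance. As a minimal, self-contained sketch of one alternative (not the answerer's method), `pd.crosstab` can build the same table. The DataFrame below is retyped from the question's sample data, and the names `order` and `counts` are illustrative only.

```python
import pandas as pd

# Sample data retyped from the question (assumed to be plain strings).
df = pd.DataFrame(
    {
        "EMID": ["SIM10001"] * 4 + ["SIM10002"] * 5 + ["SIM10003"] * 2,
        "DateNew": ["2016-06-01"] * 2 + ["2016-07-01"] * 2
        + ["2016-02-01"] * 5 + ["2017-03-01"] * 2,
        "BRalpha": ["LUMB", "LUMB", "LUMB", "THOR"] + ["NSPC"] * 5 + ["ANFT"] * 2,
    }
)

# Column order as the values first appear in the data: LUMB, THOR, NSPC, ANFT.
order = df["BRalpha"].unique()

counts = (
    # Rows: (EMID, DateNew) pairs; columns: BRalpha values; cells: occurrence counts.
    pd.crosstab([df["EMID"], df["DateNew"]], df["BRalpha"])
    .reindex(columns=order, fill_value=0)
    .add_prefix("Count_")
    .reset_index()
)
counts.columns.name = None  # drop the leftover "BRalpha" label on the column axis

print(counts)
```

`crosstab` fills absent combinations with 0 on its own, so no `fillna` step is needed; the `reindex` call is only there to control column order.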
    

    【Discussion】:
