【Question title】: How to groupby with certain condition in pandas dataframe
【Posted】: 2017-01-12 22:43:19
【Question】:

I have a dataframe like this:

    A   B
0   1   a
1   2   a
2   3   b
3   4   b
4   5   a

I want to get the result below (a 1-row, 4-column dataframe), where:

A_count_all is the number of rows in the dataframe, df.A.count()

A_sum_all is df.A.sum()

A_count_a is df.loc[df.B=='a',"A"].count()

A_sum_a is df.loc[df.B=='a',"A"].sum()


    A_count_all   A_sum_all  A_count_a   A_sum_a  
0      5            15          3            8

How can I get this result dataframe?

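【Comments on the question】:

  • Actually, I can get the value of each element, but I don't know how to rebuild the dataframe. I can learn from the answers, but I should have posted what I tried. I'll be careful next time. Thank you for your reply.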

Tags: python pandas dataframe group-by sum


【Solution 1】:

You can compute each value separately and pass them to the DataFrame constructor:

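# Compute each scalar on its own (df is the dataframe from the question)
A_count_all = df.A.count()
A_sum_all = df.A.sum()
A_count_a = df.loc[df.B=='a',"A"].count()
A_sum_a = df.loc[df.B=='a',"A"].sum()

# index=[0] is needed because all values are scalars;
# columns= fixes the column order
print(pd.DataFrame({'A_count_all':A_count_all,
                    'A_sum_all':A_sum_all,
                    'A_count_a':A_count_a,
                    'A_sum_a':A_sum_a},
                   index=[0],
                   columns=['A_count_all','A_sum_all','A_count_a','A_sum_a']))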

   A_count_all  A_sum_all  A_count_a  A_sum_a
0            5         15          3        8
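
As a variation on the constructor approach above, the count and sum can be computed together with `Series.agg` and then renamed. This is only a minimal sketch, not part of the original answer: the sample `df` is rebuilt from the question, and the names `stats_all`, `stats_a` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: A is integer, B is string)
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["a", "a", "b", "b", "a"]})

# count and sum of A over all rows; the index becomes A_count_all, A_sum_all
stats_all = df["A"].agg(["count", "sum"]).add_prefix("A_").add_suffix("_all")

# The same two statistics restricted to the rows where B == 'a'
stats_a = (df.loc[df["B"] == "a", "A"]
             .agg(["count", "sum"])
             .add_prefix("A_")
             .add_suffix("_a"))

# Join the two Series and transpose into a single-row dataframe
result = pd.concat([stats_all, stats_a]).to_frame().T
print(result)
```

The column order follows the order of concatenation, so no separate `columns=` argument is needed here.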

Thanks to Kris for another solution:

print(pd.DataFrame(data=[[df.A.count(),
                          df.A.sum(),
                          df.loc[df.B=='a',"A"].count(),
                          df.loc[df.B=='a',"A"].sum()]],
                   columns=['A_count_all','A_sum_all','A_count_a','A_sum_a']))

   A_count_all  A_sum_all  A_count_a  A_sum_a
0            5         15          3        8
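
Since the question title mentions groupby, here is a hedged sketch of how the same statistics could be produced for every value of B at once rather than only for 'a'. It goes beyond what was asked: the `A_<stat>_<key>` naming scheme and the variable names `per_group`, `flat` and `wide` are assumptions chosen to mirror the column names in the question.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["a", "a", "b", "b", "a"]})

# One row per value of B, with the count and sum of A in that group
per_group = df.groupby("B")["A"].agg(["count", "sum"])

# Flatten into one row with names like A_count_a, A_sum_a, A_count_b, A_sum_b
flat = {
    f"A_{stat}_{key}": per_group.loc[key, stat]
    for key in per_group.index
    for stat in per_group.columns
}
wide = pd.DataFrame(flat, index=[0])
print(wide)
```

The overall totals (A_count_all, A_sum_all) are not included in this sketch; they can be added as in the answers above.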

【Discussion】:

  • Or do it in one go: pd.DataFrame(data=[[df.A.count(),df.A.sum(),df.loc[df.B=='a',"A"].count(),df.loc[df.B=='a',"A"].sum()]],columns=["A_count_all","A_sum_all","A_count_a","A_sum_a"])
  • Thanks, I added it to the answer.