【Question title】: Pandas Groupby columns and get a frequency of 0
【Posted】: 2020-11-17 11:14:29
【Question】:

I have a DataFrame that I want to group by Col1, Col2, and Col3 to get the proportion of rows in each group where Value is 0. df =

Col1 Col2 Col3 Value
Val1 Val2  A    0
Val1 Val2  A    1
Val1 Val2  A    2
Val1 Val2  A    0
Val1 Val2  A    1

Val1 Val2  B    0
Val1 Val2  B    0
Val1 Val2  B    0
Val1 Val2  B    0
Val1 Val2  B    1
...

How can I apply groupby to get the following?

Col1 Col2 Col3 Percentage_of_0
Val1 Val2  A       0.4
Val1 Val2  B       0.8
...

Thanks!

【Question comments】:

  • df['Value'].eq(0).groupby([df['Col1'],df['Col2'],df['Col3']]).mean()?
  • @QuangHoang Thanks! Where did you learn that?
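
The one-liner in the comment above can be expanded into a runnable sketch. It assumes only the ten rows shown in the question (the rows hidden behind "..." are not reproduced), and it borrows the column name `Percentage_of_0` from the desired output; `is_zero` and `result` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question; rows elided with "..." are not included.
df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# True where Value is 0. The mean of a boolean Series is the fraction of True values.
is_zero = df["Value"].eq(0)

result = (
    is_zero.groupby([df["Col1"], df["Col2"], df["Col3"]])
    .mean()
    .rename("Percentage_of_0")  # name taken from the question's desired output
    .reset_index()              # turn the group keys back into ordinary columns
)
print(result)
```

Grouping the boolean Series by the key columns avoids adding a helper column to `df`, at the cost of repeating `df[...]` for each key.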

Tags: python pandas group-by


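【Solution 1】:

A simple lambda function does the job. For each group, build a list of the entries where Value == 0, then divide the length of that list by the length of the group. The result is the proportion of zeros.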

import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# For each group: number of zeros divided by the number of rows in the group
df.groupby(["Col1", "Col2", "Col3"]).agg(
    {"Value": lambda x: len([v for v in x if v == 0]) / len(x)}
)

Output

                Value
Col1 Col2 Col3       
Val1 Val2 A       0.4
          B       0.8

【Comments】:

    【Solution 2】:

    Use groupby on the DataFrame, then call the size() method on the result. For example, suppose you have created a DataFrame named df containing these values:

    df = pd.DataFrame({'Col1': ['Val1','Val1','Val1','Val1','Val1','Val1','Val1','Val1'], 
                   'Col2': ['Val2','Val2','Val2','Val2','Val2','Val2','Val2','Val2'],
                   'Col3': ['A','A','A','A','B','B','B','B'],
                   'Value':[0,1,2,0,0,0,0,1]}) 
    

    The frequency count of each distinct value within each group can then be found with:

    df.groupby(['Col1','Col2','Col3','Value']).size()
    Col1  Col2  Col3  Value
    Val1  Val2  A     0        2
                      1        1
                      2        1
                B     0        3
                      1        1
    dtype: int64
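
The counts above are not yet the proportion the question asks for. One possible way to get there, sketched here as an assumption rather than as part of the original answer, is to pivot the counts so that each distinct Value becomes a column, then divide the zero column by the row totals. The names `counts`, `table`, and `zero_share` are illustrative.

```python
import pandas as pd

# Same eight rows as in this answer.
df = pd.DataFrame({
    "Col1": ["Val1"] * 8,
    "Col2": ["Val2"] * 8,
    "Col3": ["A"] * 4 + ["B"] * 4,
    "Value": [0, 1, 2, 0, 0, 0, 0, 1],
})

counts = df.groupby(["Col1", "Col2", "Col3", "Value"]).size()

# One row per (Col1, Col2, Col3) group and one column per distinct Value.
# Combinations that never occur get a count of 0 instead of NaN.
table = counts.unstack("Value", fill_value=0)

# Share of zeros = count of zeros / total number of rows in the group.
zero_share = table[0] / table.sum(axis=1)
print(zero_share)
```

With this answer's data the shares are 0.5 for A and 0.75 for B, which differ from the question's figures because the sample has four rows per group instead of five. Note that `table[0]` raises a `KeyError` if no row in the whole frame has Value 0.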
    

    【Comments】:

      【Solution 3】:

      Here is another approach that avoids lambda, which I find easier to follow:

      df['is_zero'] = df['Value'] == 0
      df.groupby(['Col1', 'Col2', 'Col3'])['is_zero'].mean()
      

      【Comments】:

        【Solution 4】:

        Create a boolean column that marks where Value equals 0, then group on the Col columns and take the mean:

        (
            df.assign(Percentage_Of_0=lambda x: x.Value.eq(0))
            .groupby(["Col1", "Col2", "Col3"], as_index=False)
            .Percentage_Of_0.mean()
        )
        
            Col1    Col2    Col3    Percentage_Of_0
        0   Val1    Val2    A       0.4
        1   Val1    Val2    B       0.8
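
If the group sizes are wanted next to the proportion, the same boolean column can feed several named aggregations at once. This is a sketch that assumes the ten sample rows from the question; `is_zero`, `n_zero`, `n_rows`, and `summary` are illustrative names that do not appear in the original answers.

```python
import pandas as pd

# Sample data copied from the question; rows elided with "..." are not included.
df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

summary = (
    df.assign(is_zero=df["Value"].eq(0))  # helper column on a copy, df is unchanged
    .groupby(["Col1", "Col2", "Col3"], as_index=False)
    .agg(
        Percentage_Of_0=("is_zero", "mean"),  # fraction of rows that are zero
        n_zero=("is_zero", "sum"),            # number of zero rows
        n_rows=("is_zero", "count"),          # number of non-null rows in the group
    )
)
print(summary)
```

Reporting `n_rows` alongside the proportion makes it easier to spot groups whose share rests on only a handful of rows.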
        

        【Comments】:
