【Question title】:How to drop an entire group when a certain condition is met in pandas
【Posted】:2020-05-16 16:48:41
【Question】:

I am trying to drop every row of a group when a certain condition is met.

import pandas as pd


raw_data = {'regiment': ['51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st'], 
            'trucks': ['MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'ZIS-150', 'ZIS-150', 'ZIS-150', 'ZIS-150'],
            'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA', 'TATA', 'MAN', 'REN', 'TATA'],
            'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4]}


df = pd.DataFrame(raw_data, columns = ['regiment', 'trucks','drivers','counts']) 

   regiment     trucks drivers  counts
0      51st   MAZ-7310     MAZ       0
1      51st   MAZ-7310     MAZ       0
2      51st   MAZ-7310     IVE       1
3      51st   MAZ-7310     IVE       1
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4

I am trying to drop the whole MAZ-7310 group when drivers is MAZ and counts == 0.

So I followed this post: Pandas groupby and filter

df = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] == 0) & (x['drivers'] == 'MAZ')).all())

But it does not give me the output I need.

Expected output:

    regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4

How can I get this output?

Thanks.

【Comments on the question】:

  • So if any row in a group has drivers MAZ and counts 0, the whole group should be dropped?
  • @Erfan Yes, you could say that!

Tags: python pandas


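【Solution 1】:

First, we assign a new column named m, a boolean that is True on the rows where drivers is MAZ and counts is 0.

Then we use GroupBy to flag every group in which any value of m is True.

Finally, we use boolean indexing with ~ to invert that mask and keep only the unflagged groups.

Methods used: DataFrame.assign, Series.eq, DataFrame.groupby, GroupBy.transform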

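# counts.eq(0) tests for zero explicitly; ~counts would be a bitwise NOT on integers
mask = (df.assign(m=df['drivers'].eq('MAZ') & df['counts'].eq(0))
          .groupby(['regiment','trucks'])['m'].transform('any')
       )

# keep only the rows whose group contains no flagged row
df[~mask]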

   regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4
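
To see what each of the three steps produces, here is a minimal self-contained sketch that rebuilds the question's data and keeps the intermediate results in separate variables. The names row_flag, group_flag and result are illustrative and are not part of the original answer. The sketch groups the flag Series directly instead of adding a temporary column, which should behave the same way here.

```python
import pandas as pd

# Same data as in the question, written compactly.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

# Step 1: row-level flag, True only where drivers is MAZ and counts is 0.
row_flag = df['drivers'].eq('MAZ') & df['counts'].eq(0)

# Step 2: broadcast "this group has at least one flagged row" to every row.
group_flag = row_flag.groupby([df['regiment'], df['trucks']]).transform('any')

# Step 3: invert the group-level flag and keep the remaining rows.
result = df[~group_flag]
print(result)
```

Only rows 0 and 1 are flagged in step 1, but step 2 marks all four MAZ-7310 rows, which is why the whole group disappears in step 3.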

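【Comments】:

  • Wow, it is more complicated than I thought! Where can I find and learn the syntax you just posted?
  • Besides a fuller explanation, I have also listed the methods I used, with links to the documentation.
【Solution 2】:

Given what you want, you need any rather than all. So just change all to any in your code: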

df_final = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] ==0) 
                                                    & (x['drivers'] == 'MAZ')).any())

Out[234]:
   regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4
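
The difference between all and any can be checked side by side. The sketch below is only an illustration under a few assumptions: the helper flagged and the variables kept_all and kept_any are hypothetical names, and the lambda uses Python's not instead of ~ so that it returns a plain bool whichever scalar type any() or all() yields.

```python
import pandas as pd

# Same data as in the question, written compactly.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

def flagged(group):
    """Row-level condition: drivers is MAZ and counts is 0."""
    return (group['counts'] == 0) & (group['drivers'] == 'MAZ')

grouped = df.groupby(['regiment', 'trucks'])

# all(): a group is dropped only when every one of its rows is flagged.
kept_all = grouped.filter(lambda g: not flagged(g).all())

# any(): a single flagged row is enough to drop the group.
kept_any = grouped.filter(lambda g: not flagged(g).any())

print(len(kept_all), len(kept_any))
```

In the MAZ-7310 group only two of the four rows match the condition, so all() evaluates to False for that group and nothing is dropped, whereas any() evaluates to True and the group is removed.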
