【Question title】: pandas groupby column to list and keep certain values
【Posted】: 2021-05-14 12:02:29
【Question】:
I have the following DataFrame:
id occupations
111 teacher
111 student
222 analyst
333 cook
111 driver
444 lawyer
I created a new column that holds, for each id, the list of all its occupations:
new_df['occupation_list'] = df['id'].map(df.groupby('id')['occupations'].agg(list))
How can I include only the teacher and student values in occupation_list?
Tags: python, python-3.x, pandas, group-by, pandas-groupby
【Solution 1】:
You can filter the rows before the groupby:
to_map = (df[df['occupations'].isin(['teacher', 'student'])]
.groupby('id')['occupations'].agg(list)
)
df['occupation_list'] = df['id'].map(to_map)
Output:
id occupations occupation_list
0 111 teacher [teacher, student]
1 111 student [teacher, student]
2 222 analyst NaN
3 333 cook NaN
4 111 driver [teacher, student]
5 444 lawyer NaN
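A self-contained version of the approach above, as a minimal sketch. The DataFrame is rebuilt from the sample data in the question; the names `keep` and `filled` are introduced here for illustration and are not part of the original answer. Ids with no matching occupation are missing from `to_map`, which is why `map` leaves NaN for them; the last step shows one possible way to get empty lists instead.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [111, 111, 222, 333, 111, 444],
    'occupations': ['teacher', 'student', 'analyst', 'cook', 'driver', 'lawyer'],
})

keep = ['teacher', 'student']  # hypothetical name: the values to retain

# Filter first, then collect the remaining occupations per id
to_map = (df[df['occupations'].isin(keep)]
          .groupby('id')['occupations'].agg(list))

# Broadcast the per-id list back onto every row of that id
df['occupation_list'] = df['id'].map(to_map)

# Optional: ids without a match get NaN; turn those into empty lists
filled = df['occupation_list'].apply(lambda v: v if isinstance(v, list) else [])

print(df)
print(filled)
```

Filtering before grouping keeps the aggregation itself simple, at the cost of having to handle the ids that drop out entirely.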
【Solution 2】:
You can also do it this way, which gives every row the full occupation list for its id:
df.groupby('id')['occupations'].transform(' '.join).str.split()
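Note that the expression above does not filter anything, and because it joins on a space and then splits on whitespace, an occupation that itself contains a space would be broken into several entries. One possible way to add the filtering afterwards is sketched below, assuming the sample data from the question; `keep` and `all_lists` are names introduced here for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [111, 111, 222, 333, 111, 444],
    'occupations': ['teacher', 'student', 'analyst', 'cook', 'driver', 'lawyer'],
})

keep = {'teacher', 'student'}  # hypothetical name: the values to retain

# Per-row list of all occupations for that row's id (no filtering yet)
all_lists = df.groupby('id')['occupations'].transform(' '.join).str.split()

# Keep only the wanted values inside each list
df['occupation_list'] = all_lists.apply(lambda xs: [x for x in xs if x in keep])

print(df)
```

With this variant, ids without a teacher or student end up with an empty list rather than NaN.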
【Solution 3】:
You can simply do a groupby and aggregate the column into a list:
df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
Output:
>>> df
id occupations
0 111 teacher
1 111 student
2 222 analyst
3 333 cook
4 111 driver
5 444 lawyer
>>> df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
id occupations
0 111 [teacher, student, driver]
1 222 [analyst]
2 333 [cook]
3 444 [lawyer]
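The aggregation above returns one row per id and keeps every occupation, so it does not yet answer the filtering part of the question. A minimal sketch of one way to filter inside the aggregation, assuming the sample data from the question (`keep` and `result` are names introduced here for illustration):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [111, 111, 222, 333, 111, 444],
    'occupations': ['teacher', 'student', 'analyst', 'cook', 'driver', 'lawyer'],
})

keep = {'teacher', 'student'}  # hypothetical name: the values to retain

# One row per id; each list contains only the occupations found in `keep`
result = df.groupby('id', as_index=False).agg(
    {'occupations': lambda x: [v for v in x if v in keep]}
)

print(result)
```

If the lists are needed on every original row rather than once per id, the result can be mapped back with `df['id'].map(result.set_index('id')['occupations'])`.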