【Posted】: 2019-12-03 02:13:16
【Question】:
import pandas as pd

data = {
'org_id' :[4,73,6,77,21,36,40,22,21,30,31],
'flag': [['4', '73'],['73'],['6', '77'],['77'],['21'],['36', '36'],['40'],['22', '41'],['21'],['22', '30'],['31', '31']],
'r_id' : [4,4,6,6,20,20,20,22,28,28,28]
}
df = pd.DataFrame.from_dict(data)
df
The desired DataFrame looks like this:
data = {
'org_id' :[4,73,6,77,21,36,40,22,21,30,31],
'flag': [['4', '73'],['73'],['6', '77'],['77'],['21'],['36', '36'],['40'],['22', '41'],['21'],['22', '30'],['31', '31']],
'r_id' : [4,4,6,6,20,20,20,22,28,28,28],
'is_foundin_org_id': ['yes','yes','yes','yes','NO','NO','NO','yes','NO','NO','NO']
}
df2 = pd.DataFrame.from_dict(data)
df2
Output DataFrame:
Out[115]:
    org_id      flag  r_id is_foundin_org_id
0        4   [4, 73]     4               yes
1       73      [73]     4               yes
2        6   [6, 77]     6               yes
3       77      [77]     6               yes
4       21      [21]    20                NO
5       36  [36, 36]    20                NO
6       40      [40]    20                NO
7       22  [22, 41]    22               yes
8       21      [21]    28                NO
9       30  [22, 30]    28                NO
10      31  [31, 31]    28                NO
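After grouping by r_id, I need to identify whether each group's r_id value appears in the org_id column of that group's rows. For example, in the group where r_id is 4, one of the rows has org_id 4, so I mark every row of group 4 as yes. Similarly, 20 is not found in org_id for the r_id 20 group, so I mark all rows of that group as NO. Thank you.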
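The rule described above could be sketched roughly as follows. This is only one possible approach, not a confirmed answer: it assumes a "match" means org_id equals r_id on at least one row of the same r_id group, and it leaves out the flag column because the rule as stated does not use it. The names `match` and `found` are made up for illustration.

```python
import numpy as np
import pandas as pd

data = {
    'org_id': [4, 73, 6, 77, 21, 36, 40, 22, 21, 30, 31],
    'r_id': [4, 4, 6, 6, 20, 20, 20, 22, 28, 28, 28],
}
df = pd.DataFrame(data)

# Row-level check: does this row's org_id equal its own r_id?
match = df['org_id'].eq(df['r_id'])

# Group-level check: True for every row of an r_id group
# if at least one row in that group matched.
found = match.groupby(df['r_id']).transform('any')

# Assumed labels 'yes' / 'NO', copied from the desired output above.
df['is_foundin_org_id'] = np.where(found, 'yes', 'NO')
print(df)
```

Using `transform` rather than `agg` keeps the result aligned with the original rows, so it can be assigned straight back as a new column.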
【Comments】:
- What is flag used for? If it is not relevant to the question, you can remove it.
Tags: pandas dataframe pandas-groupby