【Posted】: 2014-05-10 20:12:29
【Question】:
I have a DataFrame, and I am trying to update the Sex column using the Gender column.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Users': ['Al Gore', 'Ned Flonders', 'Kim jong un', 'Al Sharpton', 'Michele', 'Richard Johnson', 'Taylor Swift', 'Alf pig', 'Dick Johnson', 'Dana Jovy'],
    'Gender': [np.nan, 'Male', 'Male', 'Male', np.nan, np.nan, 'Female', np.nan, 'Male', 'Female'],
    'Sex': ['M', np.nan, np.nan, 'M', 'F', np.nan, 'F', np.nan, np.nan, 'F'],
})
Output
>>>
   Gender  Sex            Users
0     NaN    M          Al Gore
1    Male  NaN     Ned Flonders
2    Male  NaN      Kim jong un
3    Male    M      Al Sharpton
4     NaN    F          Michele
5     NaN  NaN  Richard Johnson
6  Female    F     Taylor Swift
7     NaN  NaN          Alf pig
8    Male  NaN     Dick Johnson
9  Female    F        Dana Jovy
[10 rows x 3 columns]
So if the Gender column says Male, the Sex column should show M.
This is what I have tried so far:
df['Sex2'] = (df.Gender.isin(['Male']).map({True: 'M', False: ''}) +
              df.Sex.isin(['M']).map({True: 'M', False: ''}) +
              df.Sex.isin(['F']).map({True: 'F', False: ''}) +
              df.Gender.isin(['Female']).map({True: 'F', False: ''}))
print(df)
Output
[10 rows x 3 columns]
   Gender  Sex            Users  Sex2
0     NaN    M          Al Gore     M
1    Male  NaN     Ned Flonders     M
2    Male  NaN      Kim jong un     M
3    Male    M      Al Sharpton    MM
4     NaN    F          Michele     F
5     NaN  NaN  Richard Johnson
6  Female    F     Taylor Swift    FF
7     NaN  NaN          Alf pig
8    Male  NaN     Dick Johnson     M
9  Female    F        Dana Jovy    FF
[10 rows x 4 columns]
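I am almost there, but rows where both columns are filled come out doubled (MM, FF), and this is probably not efficient.
This is the output I want: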
>>>
   Gender  Sex            Users
0     NaN    M          Al Gore
1    Male    M     Ned Flonders
2    Male    M      Kim jong un
3    Male    M      Al Sharpton
4     NaN    F          Michele
5     NaN  NaN  Richard Johnson
6  Female    F     Taylor Swift
7     NaN  NaN          Alf pig
8    Male    M     Dick Johnson
9  Female    F        Dana Jovy
[10 rows x 3 columns]
Is it possible to do this with some kind of merge or update function?
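For reference, here is a minimal sketch of one way the desired output might be produced. It assumes that 'Male' and 'Female' are the only labels in Gender and that an existing Sex value should win over Gender when both are present. The name `gender_codes` is introduced only for illustration. It uses `Series.map` to translate the labels and `Series.fillna` to fill the gaps, rather than a merge.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Users': ['Al Gore', 'Ned Flonders', 'Kim jong un', 'Al Sharpton', 'Michele',
              'Richard Johnson', 'Taylor Swift', 'Alf pig', 'Dick Johnson', 'Dana Jovy'],
    'Gender': [np.nan, 'Male', 'Male', 'Male', np.nan, np.nan,
               'Female', np.nan, 'Male', 'Female'],
    'Sex': ['M', np.nan, np.nan, 'M', 'F', np.nan, 'F', np.nan, np.nan, 'F'],
})

# Translate the long labels to one-letter codes (hypothetical helper name).
# Values missing from the dict, including NaN, map to NaN.
gender_codes = df['Gender'].map({'Male': 'M', 'Female': 'F'})

# Keep every existing Sex value and fill only the missing ones from Gender.
df['Sex'] = df['Sex'].fillna(gender_codes)

print(df[['Gender', 'Sex', 'Users']])
```

Rows where both columns are missing stay NaN, so nothing is invented for them. `df['Sex'].combine_first(gender_codes)` should fill the same gaps; `fillna` is used here only because it states the intent more directly.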
【Comments】: