【Question Title】: Is there any way to add the matching column names to every row of a dataframe based on a specific condition in pandas?
【Posted】: 2020-08-05 07:07:00
【Question】:
I have a dataframe like this:
Name  Age  Class  Maths  English  Physics  Bio   Chemistry
A     13   7      1      None     None     1     None
B     17   10     None   1        1        None  None
I want to add a new column named Subject that contains, for each row, the names of the columns whose value is 1, as shown below:
Name  Age  Class  Subject
A     13   7      Maths, Bio
B     17   10     English, Physics
I tried several approaches, but they take longer than expected to run.
【Comments】:
Tags:
python
python-3.x
pandas
pandas-groupby
【Solution 1】:
You can use apply with a lambda function.
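# Compare only the subject columns; isin matches 1 stored as an int or as the string '1'
subjects = df.columns[3:]
df['Subject'] = df[subjects].isin([1, '1']).apply(lambda x: ', '.join(subjects[x]), axis=1)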
df = df.iloc[:, [0,1,2,-1]]
df
   Name  Age  Class           Subject
0     A   13      7        Maths, Bio
1     B   17     10  English, Physics
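For anyone who wants to try this approach end to end, here is a minimal self-contained sketch. It assumes the flags are the integer 1, that missing entries are None, and that the subject columns start at position 3. The question does not state the dtypes, so adjust the comparison if your data stores '1' as a string. The names `mask` and `result` are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed dtypes: int 1 / None).
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Age": [13, 17],
    "Class": [7, 10],
    "Maths": [1, None],
    "English": [None, 1],
    "Physics": [None, 1],
    "Bio": [1, None],
    "Chemistry": [None, None],
})

# Assumption: every column from position 3 onward is a subject column.
subjects = df.columns[3:]

# True where a subject cell equals 1; missing values compare as False.
mask = df[subjects].eq(1)

# For each row, join the names of the flagged subject columns.
df["Subject"] = mask.apply(
    lambda row: ", ".join(subjects[row.to_numpy()]), axis=1
)

result = df[["Name", "Age", "Class", "Subject"]]
print(result)
```

Restricting the comparison to the subject columns avoids accidentally picking up Age or Class when one of them happens to equal 1.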
【Solution 2】:
First, an easy-to-read approach:
subjects = ['Maths', 'English', 'Physics', 'Bio', 'Chemistry']
df['Subject'] = ""
for row in range(len(df.index)):
    output = []
    # Collect the names of the subject columns whose value is 1 in this row
    for i, col in enumerate(df.loc[df.index[row], subjects]):
        if col == 1:
            output.append(str(subjects[i]))
    df.at[df.index[row], 'Subject'] = ", ".join(output)
【Solution 3】:
# extract the subject columns
subjects = df.iloc[:, 3:].columns
# identify, per row, the columns that are not NA
notnull = df.filter(subjects).notna().to_numpy()
# get the non-null columns and assign them to the subjects column
# ... still thinking of a non python loop ... glad if anyone can drop a better replacement
df['subjects'] = [subjects[row].str.cat(sep=', ') for row in notnull]
# drop the individual subject columns
df.drop(subjects, axis=1)
   Name  Age  Class          subjects
0     A   13      7        Maths, Bio
1     B   17     10  English, Physics
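On the request above for a replacement without a Python-level loop: one commonly used idiom is a dot product between the boolean mask and the column names, which concatenates the names of the True columns. The sketch below is an illustration, not the original answerer's code. It assumes integer 1 flags, None for missing values, and subject columns starting at position 3; `flags` and `out` are illustrative names.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed dtypes: int 1 / None).
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Age": [13, 17],
    "Class": [7, 10],
    "Maths": [1, None],
    "English": [None, 1],
    "Physics": [None, 1],
    "Bio": [1, None],
    "Chemistry": [None, None],
})

# Assumption: every column from position 3 onward is a subject column.
subjects = df.columns[3:]

# Boolean mask of the flagged cells.
flags = df[subjects].eq(1)

# Dot product of booleans with strings: True keeps "Name, ", False
# contributes an empty string, and the pieces are summed (concatenated).
# The trailing separator is stripped afterwards.
df["Subject"] = flags.dot(subjects + ", ").str.rstrip(", ")

# Drop the individual subject columns, keeping only the summary.
out = df.drop(columns=subjects)
print(out)
```

Rows with no flagged subject end up with an empty string. Note that `rstrip(", ")` strips any trailing commas and spaces, which is fine here because no subject name ends in one.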