【发布时间】:2022-01-21 04:28:12
【问题描述】:
我在 pandas 中有一个数据框:
import pandas as pd
# assign data of lists.
data = {'Gender': ['M', 'F', 'M', 'F','M', 'F','M', 'F','M', 'F','M', 'F'],
'Employment': ['R','U', 'E','R','U', 'E','R','U', 'E','R','U', 'E'],
'Age': ['Y','M', 'O','Y','M', 'O','Y','M', 'O','Y','M', 'O']
}
# Create DataFrame
df = pd.DataFrame(data)
df
我想要为每个现有列的每个类别创建一个具有以下格式的新列:
Gender_M -> for when the gender equals M
Gender_F -> for when the gender equal F
Employment_R -> for when employment equals R
Employment_U -> for when employment equals U
and so on...
到目前为止,我已经创建了以下代码:
for i in range(len(df.columns)):
curent_column=list(df.columns)[i]
col_df_array = df[curent_column].unique()
for j in range(col_df_array.size):
new_col_name = str(list(df.columns)[i])+"_"+col_df_array[j]
for index,row in df.iterrows():
if(row[curent_column] == col_df_array[j]):
df[new_col_name] = row[curent_column]
问题是,即使我成功地创建了列名,我也无法获得正确的列值。
例如列性别应该如下:
data2 = {'Gender': ['M', 'F', 'M', 'F','M', 'F','M', 'F','M', 'F','M', 'F'],
'Gender_M': ['M', 'na', 'M', 'na','M', 'na','M', 'na','M', 'na','M', 'na'],
'Gender_F': ['na', 'F', 'na', 'F','na', 'F','na', 'F','na', 'F','na', 'F']
}
df2 = pd.DataFrame(data2)
只是说,na 可以是任何东西,例如空白、点或 NAN。
【问题讨论】:
标签: pandas dataframe iteration