If your tuples have different numbers of elements, a more general solution is to write a custom function like this:
import pandas as pd

def create_columns_from_tuple(df, tuple_col):
    # get max length of tuples
    max_len = df[tuple_col].apply(lambda x: 0 if x is None else len(x)).max()
    # select rows with non-empty tuples
    df_full = df.loc[df[tuple_col].notna()]
    # create dataframe with exploded tuples
    df_full_exploded = pd.DataFrame(df_full[tuple_col].tolist(),
                                    index=df_full.index,
                                    columns=[tuple_col + str(n) for n in range(1, max_len+1)])
    # merge the two dataframes by index
    result = df.merge(df_full_exploded, left_index=True, right_index=True, how='left')
    return result
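You pass the DataFrame and the name of the tuple column to the function. It automatically creates as many new columns as there are elements in the longest tuple.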
create_columns_from_tuple(df, tuple_col='b')
#      a       b   b1   b2
# 0  NaN    None  NaN  NaN
# 1  1.0  (1, 2)  1.0  2.0
# 2  2.0  (3, 4)  3.0  4.0
If your tuples have different numbers of elements:
df = pd.DataFrame({'a':[None,1, 2], 'b':[None, (1,2,42), (3,4)]})
create_columns_from_tuple(df, tuple_col='b')
#      a           b   b1   b2    b3
# 0  NaN        None  NaN  NaN   NaN
# 1  1.0  (1, 2, 42)  1.0  2.0  42.0
# 2  2.0      (3, 4)  3.0  4.0   NaN
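One caveat: the function above only treats `None` as missing. If the missing entries are stored as a float `NaN` instead, the `len(x)` call would raise a `TypeError`. A possible variant is sketched below, assuming that anything that is not a tuple or list should count as missing. The name `split_tuple_column` is made up for this illustration, and it uses `DataFrame.join` in place of `merge`, so take it as a sketch rather than a drop-in replacement.

```python
import pandas as pd


def split_tuple_column(df, tuple_col):
    """Hypothetical variant: expand a column of tuples into numbered columns."""
    # Assumption: anything that is not a tuple or list (None, NaN, ...) is missing
    is_seq = df[tuple_col].apply(lambda x: isinstance(x, (tuple, list)))
    seqs = df.loc[is_seq, tuple_col]
    # Building a DataFrame from tuples of unequal length pads the short rows
    expanded = pd.DataFrame(seqs.tolist(), index=seqs.index)
    # Name the new columns <tuple_col>1, <tuple_col>2, ... as in the function above
    expanded.columns = [f"{tuple_col}{n}" for n in range(1, expanded.shape[1] + 1)]
    # join aligns on the index, so rows without a tuple get NaN in the new columns
    return df.join(expanded)


# The missing value in 'b' is a float NaN here instead of None
df = pd.DataFrame({'a': [None, 1, 2], 'b': [float('nan'), (1, 2, 42), (3, 4)]})
result = split_tuple_column(df, 'b')
print(result)
```

Because the number of new columns is read from the expanded frame, there is no need to compute the maximum tuple length separately.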