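One option is to explode the Similar_item_id column, map each id to its name using a mapper built from the Id. and Item. columns, then groupby and aggregate back into lists: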
df['similar_item'] = (
    df['Similar_item_id'].explode()
    .map(dict(zip(df['Id.'], df['Item.'])))
    .groupby(level=0).agg(list)
)
df:
   Id.    Item. Similar_item_id     similar_item
0  1.0     Pen.          [2, 1]    [Book., Pen.]
1  2.0    Book.          [1, 4]  [Pen., Laptop.]
2  3.0   Phone.               4        [Laptop.]
3  4.0  Laptop.               3         [Phone.]
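To make the chain easier to follow, here is a minimal sketch that splits it into named intermediate steps. The variable names (exploded, id_to_item, mapped, similar_item) are illustrative and not part of the original answer; the sample frame is the one from the DataFrame constructor in this answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Id.': [1.0, 2.0, 3.0, 4.0],
    'Item.': ['Pen.', 'Book.', 'Phone.', 'Laptop.'],
    'Similar_item_id': [[2, 1], [1, 4], 4, 3],
})

# Step 1: one row per id; list cells are split and keep their original index label
exploded = df['Similar_item_id'].explode()

# Step 2: lookup table from id to item name (hypothetical helper name)
id_to_item = dict(zip(df['Id.'], df['Item.']))

# Step 3: replace each id with the matching item name
mapped = exploded.map(id_to_item)

# Step 4: gather the names that share an index label back into one list per row
similar_item = mapped.groupby(level=0).agg(list)

print(exploded.index.tolist())
print(similar_item.tolist())
```

Because explode repeats the original index label for every element of a list, groupby(level=0) is enough to reassemble the rows without a separate key column.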
Or conditionally, based on the length of each group:
df['similar_item'] = (
    df['Similar_item_id'].explode()
    .map(dict(zip(df['Id.'], df['Item.'])))
    .groupby(level=0).agg(lambda g: list(g) if len(g) > 1 else g)
)
   Id.    Item. Similar_item_id     similar_item
0  1.0     Pen.          [2, 1]    [Book., Pen.]
1  2.0    Book.          [1, 4]  [Pen., Laptop.]
2  3.0   Phone.               4          Laptop.
3  4.0  Laptop.               3           Phone.
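As a variation on the conditional version, the sketch below returns the single element explicitly with g.iloc[0] instead of returning the one-element group itself. The helper name collapse is hypothetical, and whether this spelling is preferable is a judgment call rather than something the original answer states.

```python
import pandas as pd

df = pd.DataFrame({
    'Id.': [1.0, 2.0, 3.0, 4.0],
    'Item.': ['Pen.', 'Book.', 'Phone.', 'Laptop.'],
    'Similar_item_id': [[2, 1], [1, 4], 4, 3],
})


def collapse(g):
    """Hypothetical helper: keep a list for multi-item groups, a bare value otherwise."""
    return list(g) if len(g) > 1 else g.iloc[0]


result = (
    df['Similar_item_id'].explode()
    .map(dict(zip(df['Id.'], df['Item.'])))
    .groupby(level=0).agg(collapse)
)

print(result.tolist())
```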
DataFrame constructor:
import pandas as pd
df = pd.DataFrame({
    'Id.': [1.0, 2.0, 3.0, 4.0],
    'Item.': ['Pen.', 'Book.', 'Phone.', 'Laptop.'],
    'Similar_item_id': [[2, 1], [1, 4], 4, 3]
})