【Posted】: 2021-10-24 06:51:29
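【Question】:
I have a DataFrame in which one column holds lists and another holds dictionaries. However, this is not consistent: a cell can also hold a single element or NULL. In addition, the values are parsed as strings. The DataFrame looks like this: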
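import pandas as pd
# Minimal reproduction; in the real data the shop_id and price values arrive as strings.
df = pd.DataFrame({'item_id': [1, 2, 3, 4],
                   'shop_id': [['S1', 'S2', 'S3', 'S4'], 'S2', 'S3', ['S1', 'S2', 'S3', 'S4']],
                   'price': [{'10': ['S1', 'S2'], '20': ['S3'], '30': ['S4']}, '50', 'NaN', {'10': ['S1', 'S2', 'S3'], '25': ['S4']}]})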
+-------+---------+--------------------+----------------------------------------------------+
| Index | item_id | shop_id | price |
+-------+---------+--------------------+----------------------------------------------------+
| 0 | 1 | '[S1, S2, S3, S4]' | '{'10': ['S1', 'S2'], '20': ['S3'], '30': ['S4']}' |
| 1 | 2 | 'S2' | '50' |
| 2 | 3 | 'S3' | 'NaN' |
| 3 | 4 | '[S1, S2, S3, S4]' | '{'10': ['S1', 'S2', 'S3'], '25': ['S4']}' |
+-------+---------+--------------------+----------------------------------------------------+
I would like to expand it into one row per item and shop:
+-------+---------+---------+-------+
| Index | item_id | shop_id | price |
+-------+---------+---------+-------+
| 0 | 1 | S1 | 10 |
| 1 | 1 | S2 | 10 |
| 2 | 1 | S3 | 20 |
| 3 | 1 | S4 | 30 |
| 4 | 2 | S2 | 50 |
| 5 | 3 | S3 | NaN |
| 6 | 4 | S1 | 10 |
| 7 | 4 | S2 | 10 |
| 8 | 4 | S3 | 10 |
| 9 | 4 | S4 | 25 |
+-------+---------+---------+-------+
What is the best way to achieve this? Any suggestions are appreciated. Thanks!
【Comments】:
-
Are we guaranteed that an S value in the dictionary never maps to more than one price?
-
@Henry Ecker Yes, that is guaranteed :)
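Given that guarantee, one possible approach is to invert each price dictionary into a shop-to-price lookup and emit one row per shop. The following is only a minimal sketch, not a definitive answer. The helpers `parse_cell` and `expand_row` are hypothetical names introduced here, and the sketch assumes that string cells starting with `[` or `{` are valid Python literals that `ast.literal_eval` can read; cells that are not are left unchanged.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    'item_id': [1, 2, 3, 4],
    'shop_id': [['S1', 'S2', 'S3', 'S4'], 'S2', 'S3', ['S1', 'S2', 'S3', 'S4']],
    'price': [{'10': ['S1', 'S2'], '20': ['S3'], '30': ['S4']}, '50', 'NaN',
              {'10': ['S1', 'S2', 'S3'], '25': ['S4']}],
})


def parse_cell(value):
    """Hypothetical helper: parse strings that look like a list or dict literal.

    Scalars such as 'S2', '50' or 'NaN' are returned unchanged.
    """
    if isinstance(value, str) and value.strip().startswith(('[', '{')):
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # not a valid literal (assumption: keep it as-is)
    return value


def expand_row(item_id, shops, prices):
    """Hypothetical helper: return one (item_id, shop, price) tuple per shop."""
    shops = parse_cell(shops)
    prices = parse_cell(prices)
    if not isinstance(shops, list):
        shops = [shops]  # a single shop becomes a one-element list
    if isinstance(prices, dict):
        # Invert {price: [shops]} into {shop: price}; this relies on each
        # shop appearing under at most one price, as confirmed above.
        shop_to_price = {shop: price
                         for price, shop_list in prices.items()
                         for shop in shop_list}
        return [(item_id, shop, shop_to_price.get(shop)) for shop in shops]
    # A scalar price applies to every shop in the row.
    return [(item_id, shop, prices) for shop in shops]


rows = [row
        for rec in df.itertuples(index=False)
        for row in expand_row(rec.item_id, rec.shop_id, rec.price)]
out = pd.DataFrame(rows, columns=['item_id', 'shop_id', 'price'])

# Prices are strings in the source data; coerce them to numbers so that
# 'NaN' and any shop missing from the dictionary become a real NaN.
out['price'] = pd.to_numeric(out['price'], errors='coerce')
print(out)
```

Because the price column contains a NaN, it ends up as floats after the conversion; skip the `pd.to_numeric` step if the original string prices should be kept.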
Tags: python json pandas dataframe dictionary