【Posted】: 2021-07-11 04:27:22
【Question】:
I have a pandas DataFrame whose records look like this:
0 [/computers_&_electronics,/computers_&_electronics/electronics_&_electrical,/computers_&_electronics/electronics_&_electrical/data_sheets_&_electronics_reference,/shopping,/shopping/consumer_resources,/shopping/consumer_resources/coupons_&_discount_offers]
1 [/sports,/sports/college_sports,/sports/sporting_goods,/sports/sporting_goods/basketball_equipment,/sports/team_sports,/sports/team_sports/basketball]
2 [/business_&_industrial,/business_&_industrial/advertising_&_marketing,/business_&_industrial/advertising_&_marketing/sales,/law_&_government,/law_&_government/legal,/law_&_government/legal/product_liability,/shopping,/shopping/consumer_resources]
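I want to read each hierarchy path (for example /sport/college) as an array element and then operate on it. But because the paths are not quoted (ideally they would be '/sport/college', ...), each record is read as one big string.
I tried literal_eval, but it did not work. Any other pointers? There are about 7 million records that need this array conversion, so I need an approach that is fast and scalable.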
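For reference, literal_eval fails here because an unquoted token such as /sports is not a valid Python literal, so the string cannot be parsed as a list. One possible approach, shown as a minimal sketch, is to skip parsing altogether: strip the enclosing brackets and split on commas with the vectorized pandas string methods. The column names `categories` and `paths` and the helper `parse_paths` are hypothetical, and the sketch assumes that no path contains a comma or a square bracket and that there is no whitespace around the separators, as in the sample rows above.

```python
import pandas as pd


def parse_paths(col: pd.Series) -> pd.Series:
    """Turn strings like "[/a,/a/b]" into lists like ["/a", "/a/b"].

    Assumes paths contain no commas or square brackets (an assumption
    based on the sample rows, not something the data guarantees).
    """
    # Remove the leading "[" and trailing "]", then split on the comma separator.
    return col.str.strip("[]").str.split(",")


# Hypothetical column name; two shortened rows in the same shape as the question's data.
df = pd.DataFrame(
    {"categories": ["[/sports,/sports/college_sports]", "[/shopping]"]}
)
df["paths"] = parse_paths(df["categories"])
print(df["paths"].tolist())
```

Each cell of `paths` is then a regular Python list, so `df["paths"].explode()` would give one row per path if that suits the later processing. Whether this is fast enough for 7 million rows would need to be measured on the real data.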
【Comments】:
Tags: python arrays python-3.x pandas