【Posted】: 2022-01-23 18:29:50
【Question】:
I have the following pandas DataFrame:
import pandas as pd
data = pd.DataFrame({'keys': {3: 'brandId', 5: 'price', 14: 'sizes', 18: 'brandId', 20: 'price', 29: 'sizes', 30: 'condition', 31: 'condition', 32: 'colour', 33: 'age', 36: 'brand', 40: 'colour', 41: 'brand', 44: 'productType', 50: 'brandId', 52: 'price', 61: 'sizes', 62: 'condition', 63: 'colour', 64: 'age', 67: 'brand', 70: 'productType'}, 'values': {3: 925, 5: {'currencyName': 'GBP', 'priceAmount': '50.00', 'nationalShippingCost': '3.00'}, 14: {'id': 4, 'name': 'UK 4', 'quantity': 1}, 18: 925, 20: {'currencyName': 'GBP', 'priceAmount': '11.00', 'nationalShippingCost': '0.00'}, 29: {'id': 3, 'name': 'S', 'quantity': 1}, 30: {'id': 'used_like_new', 'name': 'Like new'}, 31: {'id': 'brand_new', 'name': 'Brand new'}, 32: {'id': 'multi', 'name': 'Multi'}, 33: {'id': 'modern', 'name': 'Modern'}, 36: 'chinese-laundry', 40: {'id': 'white', 'name': 'White'}, 41: 'chinese-laundry', 44: 'tshirts', 50: 925, 52: {'currencyName': 'GBP', 'priceAmount': '20.00', 'nationalShippingCost': '3.00'}, 61: {'id': 11, 'name': 'M', 'quantity': 1}, 62: {'id': 'brand_new', 'name': 'Brand new'}, 63: {'id': 'black', 'name': 'Black'}, 64: {'id': '90s', 'name': '90s'}, 67: 'chinese-laundry', 70: 'jackets'}})
It looks like this:
    keys       values
3   brandId    925
5   price      {'currencyName': 'GBP', 'priceAmount': '50.00'...
14  sizes      {'id': 4, 'name': 'UK 4', 'quantity': 1}
18  brandId    925
20  price      {'currencyName': 'GBP', 'priceAmount': '11.00'...
29  sizes      {'id': 3, 'name': 'S', 'quantity': 1}
30  condition  {'id': 'used_like_new', 'name': 'Like new'}
...
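I want to flatten each dictionary to a single value chosen according to its key. For example, for price I want only the value of priceAmount, and for every other dictionary-valued key I want the value of name.
So the expected output is: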
    keys       values
3   brandId    925
5   price      50.00
14  sizes      UK 4
18  brandId    925
20  price      11.00
29  sizes      S
30  condition  Like new
I can do this with the code below, but it would take a long time to write if I had more keys to replace!
price_data = []
for price in data[data['keys'].str.contains('price', na=False)].values:
    price_data.append(price[1]['priceAmount'])
condition_data = []
for condition in data[data['keys'].str.contains('condition', na=False)].values:
    condition_data.append(condition[1]['name'])
age_data = []
for age in data[data['keys'].str.contains('age', na=False)].values:
    age_data.append(age[1]['name'])
sizes_data = []
for sizes in data[data['keys'].str.contains('sizes', na=False)].values:
    sizes_data.append(sizes[1]['name'])
colour_data = []
for colour in data[data['keys'].str.contains('colour', na=False)].values:
    colour_data.append(colour[1]['name'])
# replace the values
data = data.replace(data[data['keys'].str.contains('price', na=False)]['values'].values, price_data)
data = data.replace(data[data['keys'].str.contains('condition', na=False)]['values'].values, condition_data)
data = data.replace(data[data['keys'].str.contains('age', na=False)]['values'].values, age_data)
data = data.replace(data[data['keys'].str.contains('sizes', na=False)]['values'].values, sizes_data)
data = data.replace(data[data['keys'].str.contains('colour', na=False)]['values'].values, colour_data)
Is there a faster, cleaner alternative?
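For illustration, the transformation described above could be sketched in a single pass rather than one loop per key. This is only a minimal sketch, not a definitive answer: the `FIELD_BY_KEY` mapping and the `flatten` helper are hypothetical names introduced here, the sample frame is a four-row subset of the data above, and it assumes every dictionary other than price should fall back to its `name` field.

```python
import pandas as pd

# Four-row subset of the question's data, used only for illustration.
data = pd.DataFrame(
    {
        "keys": ["brandId", "price", "sizes", "condition"],
        "values": [
            925,
            {"currencyName": "GBP", "priceAmount": "50.00", "nationalShippingCost": "3.00"},
            {"id": 4, "name": "UK 4", "quantity": 1},
            {"id": "used_like_new", "name": "Like new"},
        ],
    },
    index=[3, 5, 14, 30],
)

# Hypothetical mapping: the field to extract for each key.
# Keys not listed here fall back to "name" (an assumption).
FIELD_BY_KEY = {"price": "priceAmount"}


def flatten(key, value):
    """Return one field of a dict value; leave non-dict values unchanged."""
    if isinstance(value, dict):
        field = FIELD_BY_KEY.get(key, "name")
        # If the chosen field is missing, keep the original dict.
        return value.get(field, value)
    return value


# Walk both columns together and rebuild the 'values' column once.
data["values"] = [flatten(k, v) for k, v in zip(data["keys"], data["values"])]
print(data)
```

Supporting another key with a different field would then only need one more entry in `FIELD_BY_KEY`, instead of another loop and another `replace` call.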
【Comments】: