Convert a column with a list of dicts, where column name and value are both present as values inside dict keys

【Question title】: Convert a column with a list of dicts, where column name and value are both present as values inside dict keys
【Posted】: 2020-01-08 01:09:04
【Question description】:

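This question differs from the others because in those, the column names are not present among the values of the dict keys. Please look at the example given before marking it as a duplicate.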

I have a df like this:

df: col1 col2 col3
    100  200  [{'attribute': 'Pattern', 'value': 'Printed'},...

A closer look at what col3 contains:

[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]

Here each 'attribute' is a column name and each 'value' is the value for that column.

The output should look like this:

df: col1   col2    Pattern       Topwear style       Bottomwear Length ....
    100    200     Printed       T shirt             Short

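There are multiple rows, with both repeated and new attributes and values. How would I do this in pandas? I tried searching for something similar but couldn't find anything useful.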

【Question discussion】:

What have you tried so far?
No, that's why I had to post it again.

Tags: python pandas list dictionary

【Solution 1】:

Try:

# One single-row frame per list of dicts, stacked and joined back on a 0..n-1 index
df = df.join(pd.concat([pd.DataFrame(v).set_index('attribute').T
                        for v in df.pop('col3')]).reset_index(drop=True))

Setup:

import pandas as pd

d=[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]
df=pd.DataFrame({'a':100,'b':200,'col3':[d]},index=[0])
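
Since the question mentions multiple rows with both repeated and new attributes, here is a minimal sketch of the same one-liner applied to two rows whose attribute sets differ. The names `row0`, `row1`, `wide` and `result`, and the second row's values, are made up for illustration. Attributes missing from a row should come out as NaN.

```python
import pandas as pd

# Hypothetical rows: both share 'Pattern', each has one attribute the other lacks.
row0 = [{'attribute': 'Pattern', 'value': 'Printed'},
        {'attribute': 'Neck', 'value': 'Round'}]
row1 = [{'attribute': 'Pattern', 'value': 'Plain'},
        {'attribute': 'Sleeve style', 'value': 'Sleeveless'}]

df = pd.DataFrame({'col1': [100, 20], 'col2': [200, 10], 'col3': [row0, row1]})

# Each list of dicts becomes a single-row frame whose columns are the attributes;
# concat aligns the columns across rows and fills the gaps with NaN.
wide = pd.concat(
    [pd.DataFrame(v).set_index('attribute').T for v in df.pop('col3')]
).reset_index(drop=True)

# The join relies on df having the default 0..n-1 index.
result = df.join(wide)
print(result)
```

If `df` carries a non-default index, the `reset_index(drop=True)` step no longer lines up with it, so the index would need to be reset or reassigned first.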

Output:

【Discussion】:

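【Solution 2】:

# Assumes each row of col3 holds a single dict (see the test DataFrame below)
x = df['col3'].tolist()
newcol = {item['attribute']: [item['value']] for item in x}
newdf = pd.DataFrame(newcol)  # one row, one column per attribute
del df['col3']
# how='right' keeps only the index of newdf, i.e. the single row 0
print(df.join(newdf, how='right'))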

Output:

   col1  col2  Pattern Topwear style Bottomwear Length  Colour Palette  \
0   100   200  Printed       T shirt             Short  Bright colours  
...

DataFrame used for testing:

import pandas as pd

data = {'col1':100, 'col2': 200, 'col3': [{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]}

df = pd.DataFrame(data)

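【Discussion】:

@jezrael Sure, check the update.
Yes, the data format in the answer differs from the OP's, so it fails on the OP's data.
For a pandas newbie like me, your solution is definitely easier to understand.
No, it's simple too. Thanks for your answer, upvoted.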
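
The format difference mentioned in the comments above can be made concrete by comparing the two layouts. This is only a sketch with a shortened attribute list. The names `attrs`, `per_row` and `per_cell` are illustrative and do not come from the thread.

```python
import pandas as pd

attrs = [
    {'attribute': 'Pattern', 'value': 'Printed'},
    {'attribute': 'Neck', 'value': 'Round'},
]

# Layout used in this answer: the scalars are broadcast, giving one dict per row.
per_row = pd.DataFrame({'col1': 100, 'col2': 200, 'col3': attrs})

# Layout described in the question: a single row whose cell holds the whole list.
per_cell = pd.DataFrame({'col1': [100], 'col2': [200], 'col3': [attrs]})

print(per_row.shape)   # (2, 3)
print(per_cell.shape)  # (1, 3)
print(type(per_row.loc[0, 'col3']).__name__)   # dict
print(type(per_cell.loc[0, 'col3']).__name__)  # list
```

With the second layout, iterating over `df['col3'].tolist()` yields lists rather than dicts, which is why `item['attribute']` does not work on the question's data.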

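【Solution 3】:

You can use a nested list comprehension with a dict comprehension to get a list of dictionaries that can be passed to the DataFrame constructor:

The advantage is better performance, since plain dicts are built instead of one intermediate DataFrame per row; the drawback is that it is a bit more complicated.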

d = [{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'}
]

df = pd.DataFrame({'col1':[100, 20], 'col2':[200, 10], 'col3':[d, d]})
print (df)

   col1  col2                                               col3
0   100   200  [{'attribute': 'Pattern', 'value': 'Printed'},...
1    20    10  [{'attribute': 'Pattern', 'value': 'Printed'},...

a = [{y['attribute']: y['value']  for y in x for k, v in y.items()} for x in df.pop('col3')]

df = df.join(pd.DataFrame(a))
print (df)
   col1  col2  Pattern Topwear style Bottomwear Length  Colour Palette
0   100   200  Printed       T shirt             Short  Bright colours
1    20    10  Printed       T shirt             Short  Bright colours

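【Discussion】:

Why do you still do the `for k, v in y.items()` part?
@piyushdaga - Because it flattens, similar to the idea for flattening lists from this
@piyushdaga - It is not necessary here, because y['attribute']: y['value'] is used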
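
As the last comment notes, the inner `for k, v in y.items()` loop is not needed. Below is a sketch of the simplified comprehension, with a check that it builds the same dicts as the original form. Passing `index=df.index` is an addition of this sketch, not part of the answer, so that the join still lines up when `df` does not have the default index.

```python
import pandas as pd

d = [{'attribute': 'Pattern', 'value': 'Printed'},
     {'attribute': 'Topwear style', 'value': 'T shirt'},
     {'attribute': 'Bottomwear Length', 'value': 'Short'},
     {'attribute': 'Colour Palette', 'value': 'Bright colours'}]

df = pd.DataFrame({'col1': [100, 20], 'col2': [200, 10], 'col3': [d, d]})
col3 = df.pop('col3')

# Original form from the answer, including the redundant inner loop.
original = [{y['attribute']: y['value'] for y in x for k, v in y.items()}
            for x in col3]

# Simplified form: one dict per row, keyed directly by attribute name.
rows = [{y['attribute']: y['value'] for y in x} for x in col3]

# Reuse df's index so the join aligns row by row.
result = df.join(pd.DataFrame(rows, index=df.index))
print(result)
```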