Converting JSON data to a Python data frame: answers

[Question title]: Converting JSON data to Python data frame
[Posted]: 2020-05-29 21:57:27
[Question description]:

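I am trying to convert JSON data into a Python data frame. When I normalize the JSON data, the whole payload ends up stored as a Series object in a single record. Could you tell me how to convert the following JSON data into a Python data frame?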

Code:

[{'Name':"SS",  
  'Order':[{'Type':'DO','Value':'10.11/7654326'},  
           {'Type':'UR','Value':'https://do.org/10.11/765436'}],
  'Order_Type':'dsggg',
  'Performance':[{'Per':{'Begin_Date':'2018-01-01','End_Date':'2018-02-02'},  
                  'Ins':[{'Me':'TT','Sales':2}]}]},
{'Name':"MM",
  'Order':[{'Type':'DO','Value':'10.11/7654326'},  
           {'Type':'UR','Value':'https://do.org/10.11/765436'}],
  'Order_Type':'dsggg',
  'Performance':[{'Per':{'Begin_Date':'2018-01-01','End_Date':'2018-02-02'},  
                  'Ins':[{'Me':'TT','Sales':2}]}]}

]

[Question comments]:

Can you tell us what you have tried in your code so far?
What did you do to get this? Please share the code as well.
What is the expected DataFrame output?
Code:
    #Normalizing data
    df = pd.io.json.json_normalize(list_data)
    df = pd.read_json(Items, orient='columns')
    import re
    print(re.split('[{ ,|,|,',a))
    Report_Item.dtypes
    Report_Item.mentions.head(10)
    AttributeError: 'Series' object has no attribute 'mentions'
    # create a list of strings
    columns = ['Name', 'Order', 'Performance','Sales']
    df = pd.DataFrame(response, columns=columns,index=index)
    df
I expect the columns to be Name, Do, UR, Order_Type, Begin_Date, End_Date, Sales, with the values as the records.

Tags: python json

[Solution 1]:

If you would rather not loop over each record yourself, you can use pandas.

import pandas as pd

data = [{'Name':"SS",
  'Order':[{'Type':'DO','Value':'10.11/7654326'},
           {'Type':'UR','Value':'https://do.org/10.11/765436'}],
  'Order_Type':'dsggg',
  'Performance':[{'Per':{'Begin_Date':'2018-01-01','End_Date':'2018-02-02'},
                  'Ins':[{'Me':'TT','Sales':2}]}]},
{'Name':"MM",
  'Order':[{'Type':'DO','Value':'10.11/7654326'},
           {'Type':'UR','Value':'https://do.org/10.11/765436'}],
  'Order_Type':'dsggg',
  'Performance':[{'Per':{'Begin_Date':'2018-01-01','End_Date':'2018-02-02'},
                  'Ins':[{'Me':'TT','Sales':2}]}]}]

# pd.json_normalize is the top-level function since pandas 1.0;
# the pandas.io.json import path is deprecated.
# It already returns a DataFrame, so no DataFrame.from_dict wrapper is needed.
df = pd.json_normalize(data)
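
A plain normalize call leaves the Order and Performance columns as lists of dicts. The following is a minimal sketch of one way to flatten them into the columns the asker listed, assuming each Name is unique and each Order list holds at most one entry per Type. The names `perf`, `orders`, and `flat` are illustrative and not part of the original answer.

```python
import pandas as pd

data = [
    {"Name": "SS",
     "Order": [{"Type": "DO", "Value": "10.11/7654326"},
               {"Type": "UR", "Value": "https://do.org/10.11/765436"}],
     "Order_Type": "dsggg",
     "Performance": [{"Per": {"Begin_Date": "2018-01-01", "End_Date": "2018-02-02"},
                      "Ins": [{"Me": "TT", "Sales": 2}]}]},
    {"Name": "MM",
     "Order": [{"Type": "DO", "Value": "10.11/7654326"},
               {"Type": "UR", "Value": "https://do.org/10.11/765436"}],
     "Order_Type": "dsggg",
     "Performance": [{"Per": {"Begin_Date": "2018-01-01", "End_Date": "2018-02-02"},
                      "Ins": [{"Me": "TT", "Sales": 2}]}]},
]

# One row per Performance[*].Ins[*] entry; meta carries the parent fields along.
perf = pd.json_normalize(
    data,
    record_path=["Performance", "Ins"],
    meta=[
        "Name",
        "Order_Type",
        ["Performance", "Per", "Begin_Date"],
        ["Performance", "Per", "End_Date"],
    ],
).rename(columns={
    "Performance.Per.Begin_Date": "Begin_Date",
    "Performance.Per.End_Date": "End_Date",
})

# One row per Order entry, then spread the Type values (DO, UR) into columns.
# Assumption: Name is unique, otherwise pivot raises on duplicate entries.
orders = (
    pd.json_normalize(data, record_path="Order", meta=["Name"])
    .pivot(index="Name", columns="Type", values="Value")
    .reset_index()
)

# Join both pieces on Name and put the columns in the requested order.
flat = perf.merge(orders, on="Name")
flat = flat[["Name", "DO", "UR", "Order_Type", "Begin_Date", "End_Date", "Me", "Sales"]]
print(flat)
```

The Order column is named DO here because that is the Type value in the data; rename it to Do if that spelling is required.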

[Comments]:

But the Order and Performance columns are still nested:
    Name Order Order_Type Performance
    0 SS [{'Type': 'DO', 'Value': '10.11/7654326'}, {'T... dsggg [{'Per': {'Begin_Date': '2018-01-01', 'End_Dat...
    1 MM [{'Type': 'DO', 'Value': '10.11/7654326'}, {'T... dsggg [{'Per': {'Begin_Date': '2018-01-01', 'End_Dat...
You have to write a for loop and pull out each level.
See this answer: stackoverflow.com/questions/45168524/…
Inside the for loop it throws an error: 'list' object is not callable.
I am trying to extract the columns and their corresponding values with a for loop, but it raises an error because the list object is not callable. Here is the code:
    cols = ['Name', 'Do', 'UR', 'Order_Type', 'Begin_Date','End_Date','Me','Sales']
    rows = []
    for data in temp:
        data_id = data['Name']
        criteria = data['Order']['Performance']
        for d in criteria:
            rows.append([data_id, criteria.index(d)+1, *list(d.values())[:-1] ])
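
In the loop from the last comment, data['Order'] is a list, so indexing it with 'Performance' cannot work, and Performance is a sibling key of Order rather than a child of it. The "'list' object is not callable" message usually means a list is being called with parentheses, for example when the built-in name list was rebound to a list earlier in the session; that is only a guess, since the full traceback is not shown. A minimal sketch of a loop that walks each level explicitly, assuming `temp` holds the same records as in the question:

```python
import pandas as pd

# Assumption: temp is the list of records from the question.
temp = [
    {"Name": "SS",
     "Order": [{"Type": "DO", "Value": "10.11/7654326"},
               {"Type": "UR", "Value": "https://do.org/10.11/765436"}],
     "Order_Type": "dsggg",
     "Performance": [{"Per": {"Begin_Date": "2018-01-01", "End_Date": "2018-02-02"},
                      "Ins": [{"Me": "TT", "Sales": 2}]}]},
    {"Name": "MM",
     "Order": [{"Type": "DO", "Value": "10.11/7654326"},
               {"Type": "UR", "Value": "https://do.org/10.11/765436"}],
     "Order_Type": "dsggg",
     "Performance": [{"Per": {"Begin_Date": "2018-01-01", "End_Date": "2018-02-02"},
                      "Ins": [{"Me": "TT", "Sales": 2}]}]},
]

cols = ["Name", "Do", "UR", "Order_Type", "Begin_Date", "End_Date", "Me", "Sales"]
rows = []
for record in temp:
    # Turn the Order list into a {Type: Value} lookup, e.g. {"DO": ..., "UR": ...}.
    order = {o["Type"]: o["Value"] for o in record["Order"]}
    # Performance is a list of dicts, each with a Per dict and an Ins list.
    for perf in record["Performance"]:
        for ins in perf["Ins"]:
            rows.append({
                "Name": record["Name"],
                "Do": order.get("DO"),   # None if the record has no DO entry
                "UR": order.get("UR"),
                "Order_Type": record["Order_Type"],
                "Begin_Date": perf["Per"]["Begin_Date"],
                "End_Date": perf["Per"]["End_Date"],
                "Me": ins["Me"],
                "Sales": ins["Sales"],
            })

# One row per Ins entry, columns in the order the asker listed.
df = pd.DataFrame(rows, columns=cols)
print(df)
```

Building each row as a dict keyed by column name avoids depending on the order of d.values(), which is what the original append relied on.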