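[Posted]: 2021-11-04 01:28:19
[Question]:
I know this question has been asked many times, but none of the answers meets my requirements. I want to dynamically convert any nested JSON into a CSV file or a DataFrame. Some examples follow: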
Input 1: {"menu": {
"header": "SVG Viewer",
"items": [
{"id": "Open"},
{"id": "OpenNew", "label": "Open New"},
null,
{"id": "ZoomIn", "label": "Zoom In"},
{"id": "ZoomOut", "label": "Zoom Out"},
{"id": "OriginalView", "label": "Original View"},
null,
{"id": "Quality"},
{"id": "Pause"},
{"id": "Mute"},
null,
{"id": "Find", "label": "Find..."},
{"id": "FindAgain", "label": "Find Again"},
{"id": "Copy"},
{"id": "CopyAgain", "label": "Copy Again"},
{"id": "CopySVG", "label": "Copy SVG"},
{"id": "ViewSVG", "label": "View SVG"},
{"id": "ViewSource", "label": "View Source"},
{"id": "SaveAs", "label": "Save As"},
null,
{"id": "Help"},
{"id": "About", "label": "About Adobe CVG Viewer..."}
]
}}
Input 2: {"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{"value": "New", "onclick": "CreateNewDoc()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Close", "onclick": "CloseDoc()"}
]
}
}}
So far I have tried the code below. It works, but it expands list-type data into columns, whereas I want lists expanded into rows.
import pandas as pd


def flatten_json(y):
    # Flatten nested dicts/lists into one flat dict with dotted keys.
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for a in x:
                flatten(x[a], name + a + '.')
        elif isinstance(x, list):
            # List elements are keyed by index, so each element becomes its own column.
            for i, a in enumerate(x):
                flatten(a, name + str(i) + '.')
        else:
            # Leaf value: drop the trailing '.' from the accumulated key.
            out[name[:-1]] = str(x)

    flatten(y)
    return out


def start_explode(data):
    # A single dict becomes one row; a list of dicts becomes one row per dict.
    if isinstance(data, dict):
        df = pd.DataFrame([flatten_json(data)])
    else:
        df = pd.DataFrame([flatten_json(x) for x in data])
    df = df.astype(str)
    return df


complex_json = {"menu": {
    "id": "file",
    "value": "File",
    "popup": {
        "menuitem": [
            {"value": "New", "onclick": "CreateNewDoc()"},
            {"value": "Open", "onclick": "OpenDoc()"},
            {"value": "Close", "onclick": "CloseDoc()"}
        ]
    }
}}

df = start_explode(complex_json['menu'])
display(df)  # display() is available in Jupyter/IPython notebooks
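For the input above, it produces a single row in which every list element gets its own set of columns (id, value, popup.menuitem.0.value, popup.menuitem.0.onclick, popup.menuitem.1.value, and so on).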
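A minimal sketch of the row-wise expansion asked for above, under a few assumptions that are not stated in the question: `explode_rows` is a hypothetical helper name (not part of pandas), `null` entries inside a list are skipped, and keys shared by all rows (such as `id`) are repeated on every row.

```python
import pandas as pd


def explode_rows(node, prefix=""):
    """Return a list of flat dicts; every list element yields its own row(s)."""
    if isinstance(node, dict):
        rows = [{}]
        for key, value in node.items():
            child_rows = explode_rows(value, f"{prefix}{key}.")
            # Cross-join the rows built so far with the rows produced by this key.
            rows = [{**left, **right} for left in rows for right in child_rows]
        return rows
    if isinstance(node, list):
        rows = []
        for item in node:
            if item is None:  # assumption: null list entries are skipped
                continue
            # No index is added to the key, so all elements share the same columns.
            rows.extend(explode_rows(item, prefix))
        return rows or [{}]  # an empty list must not wipe out the parent row
    # Scalar leaf: drop the trailing "." from the accumulated prefix.
    return [{prefix[:-1]: node}]


menu = {
    "id": "file",
    "value": "File",
    "popup": {
        "menuitem": [
            {"value": "New", "onclick": "CreateNewDoc()"},
            {"value": "Open", "onclick": "OpenDoc()"},
            {"value": "Close", "onclick": "CloseDoc()"},
        ]
    },
}

df = pd.DataFrame(explode_rows(menu))
print(df)  # three rows, one per menuitem entry
```

Note that with this approach two sibling lists in the same object produce the Cartesian product of their rows, which may or may not be the desired behaviour for a given document.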
[Comments]:
-
Please check How to Ask. What have you tried so far, and which specific problem were you unable to solve yourself?
-
Updated the question, @buran.
Tags: python json pandas dataframe pyspark