【Posted】: 2021-01-01 20:44:35
【Question】:
I am trying to convert a CSV file into a hierarchical JSON file. The CSV input is shown below; it contains two columns, gene and disease.
gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A1BG,Athritis
A2M,Asthma
A2M,Astrocytoma
A2M,Diabetes
NAT1,polyps
NAT1,lymphoma
NAT1,neoplasms
The expected output should be in the following format:
{
    "name": "A1BG",
    "children": [
        {"name": "Adenocarcinoma"},
        {"name": "apnea"},
        {"name": "Athritis"}
    ]
},
{
    "name": "A2M",
    "children": [
        {"name": "Asthma"},
        {"name": "Astrocytoma"},
        {"name": "Diabetes"}
    ]
},
{
    "name": "NAT1",
    "children": [
        {"name": "polyps"},
        {"name": "lymphoma"},
        {"name": "neoplasms"}
    ]
}
The Python code I wrote is below. Please let me know what I need to change to get the desired output.
import json
finalList = []
finalDict = {}
grouped = df.groupby(['gene'])
for key, value in grouped:
    dictionary = {}
    dictList = []
    anotherDict = {}
    j = grouped.get_group(key).reset_index(drop=True)
    dictionary['name'] = j.at[0, 'gene']
    for i in j.index:
        anotherDict['disease'] = j.at[i, 'disease']
        dictList.append(anotherDict)
    dictionary['children'] = dictList
    finalList.append(dictionary)
with open('outputresult3.json', "w") as out:
    json.dump(finalList, out)
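Two things in the code above appear to keep it from producing the expected output. First, `anotherDict` is created once per gene and the same object is appended on every pass of the inner loop, so every child ends up showing the last disease of the group. Second, the child key is `'disease'`, while the expected output uses `'name'`. The following is a minimal sketch of one possible fix, not the only way to do it. It assumes the CSV has exactly the `gene` and `disease` columns shown above; the inline `csv_text` sample and the variable names are illustrative stand-ins for the real file and `df`.

```python
import io
import json

import pandas as pd

# Sample rows copied from the question (assumption: the real data would be
# loaded with pd.read_csv on the actual CSV file instead).
csv_text = """gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A1BG,Athritis
A2M,Asthma
A2M,Astrocytoma
A2M,Diabetes
NAT1,polyps
NAT1,lymphoma
NAT1,neoplasms
"""
df = pd.read_csv(io.StringIO(csv_text))

final_list = []
# Grouping by a single column name yields (gene, sub-frame) pairs;
# sort=False keeps the genes in order of first appearance.
for gene, group in df.groupby("gene", sort=False):
    # Build a fresh dict per disease so the entries do not share one object.
    children = [{"name": disease} for disease in group["disease"]]
    final_list.append({"name": gene, "children": children})

# Serialize to a string here; json.dump(final_list, out) would write to a file
# as in the original code.
print(json.dumps(final_list, indent=2))
```

Note that this produces a JSON array (the objects wrapped in `[` and `]`), since a bare comma-separated sequence of objects like the one in the expected output is not valid JSON on its own.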
【Comments】:
Tags: python json pandas csv dictionary