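【Posted】: 2022-01-18 22:21:12
【Question】:
This is my first question here. I have searched here and across the web, but I can't seem to find an answer to my problem. I am trying to explode a list in a json file into multiple columns and rows. Everything I have tried so far has been unsuccessful.
I am doing this for multiple json files in a directory so that they print out in a dataframe like this. Goal: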
| did | Version | Nodes | rds | time | c | sc | f | uc |
|---|---|---|---|---|---|---|---|---|
| did | Version | Nodes | rds | time | c | sc | f | uc |
| did | Version | Nodes | rds | time | c | sc | f | uc |
| did | Version | Nodes | rds | time | c | sc | f | uc |
Instead, I get this in my dataframe:
| did | Version | Nodes | rds | fusage |
|---|---|---|---|---|
| did | Version | Nodes | rds | everything in fusage |
| did | Version | Nodes | rds | everything in fusage |
| did | Version | Nodes | rds | everything in fusage |
A sample of the json I am working with. The json structure will not change.
{
    "did": "123456789",
    "mId": "1a2b3cjsks",
    "timestamp": "2021-11-26T11:10:58.322000",
    "beat": {
        "did": "123456789",
        "collectionTime": "2010-05-26 11:10:58.004783+00",
        "Nodes": 6,
        "Version": "v1.4.6-2",
        "rds": "0.00B",
        "fusage": [
            {
                "time": "2010-05-25",
                "c": "string",
                "sc": "string",
                "f": "string",
                "uc": "int"
            },
            {
                "time": "2010-05-19",
                "c": "string",
                "sc": "string",
                "f": "string",
                "uc": "int"
            },
            {
                "t": "2010-05-23",
                "c": "string",
                "sc": "string",
                "f": "string",
                "uc": "int"
            },
            {
                "time": "2010-05-23",
                "c": "string",
                "sc": "string",
                "f": "string",
                "uc": "int"
            }
        ]
    }
}
My end goal is to output the dataframe to a csv so it can be ingested. Thanks, everyone, for your help with this.
Using python 3.8.10 and pandas 1.3.4
Python code below:
import csv
import glob
import json
import os
import pandas as pd

tempdir = '/dir/to/files/json_temp'
json_files = os.path.join(tempdir, '*.json')
file_list = glob.glob(json_files)
dfs = []
for file in file_list:
    with open(file) as f:
        data = pd.json_normalize(json.loads(f.read()))
    dfs.append(data)
df = pd.concat(dfs, ignore_index=True)
df.explode('fusage')
print(df)
【Discussion】:
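One possible approach, offered as a sketch rather than a definitive fix: `pd.json_normalize` accepts a `record_path` and a `meta` argument, so it can produce one row per `fusage` entry and repeat the parent fields on each row, without a separate `explode` step. The helper names `flatten_beat` and `load_directory` and the constant `META_FIELDS` are made up for illustration; the field names come from the sample json above. Note that the third `fusage` entry in the sample uses the key `t` instead of `time`, so with that exact data an extra `t` column would appear and `time` would be missing for that row.

```python
import glob
import json
import os

import pandas as pd

# Fields of the "beat" object to repeat on every fusage row (assumed from the sample).
META_FIELDS = ["did", "Version", "Nodes", "rds"]


def flatten_beat(record):
    """Return one row per entry in record["beat"]["fusage"]."""
    beat = record["beat"]
    # record_path selects the list to expand; meta lists the parent fields to carry along.
    df = pd.json_normalize(beat, record_path="fusage", meta=META_FIELDS)
    # json_normalize puts the meta columns last; move them to the front.
    other_columns = [c for c in df.columns if c not in META_FIELDS]
    return df[META_FIELDS + other_columns]


def load_directory(tempdir):
    """Flatten every *.json file in tempdir into a single DataFrame."""
    frames = []
    for path in sorted(glob.glob(os.path.join(tempdir, "*.json"))):
        with open(path) as f:
            frames.append(flatten_beat(json.load(f)))
    return pd.concat(frames, ignore_index=True)
```

Assuming the directory holds at least one file, something like `load_directory('/dir/to/files/json_temp').to_csv('fusage.csv', index=False)` would then write the flattened result to a csv (the output file name here is only a placeholder).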
Tags: python json pandas pandas-explode