【Question Title】: Parse Multi-Level JSON Python
【Posted】: 2019-04-05 12:24:08
【Question】:

JSON response -

{
  "001": {
    "STUDENTTYPE": {
      "TYPE": "Boarder"
    },
    "ACADEMICS": [
      {
        "SCI": 42,
        "MTH": 22
      },
      {
        "SCI": 49,
        "MTH": 36
      },
      {
        "SCI": 42,
        "MTH": 26
      }
    ],
    "ROLL": "001",
    "NAME": "Ben",
    "CLASS": "XI",
    "CLASSTEACHER": "Aka",
    "HOME": "Katrasgarh"
  },
  "002": {
    "STUDENTTYPE": {
      "TYPE": "DayScholar"
    },
    "ACADEMICS": [
      {
        "SCI": 43,
        "MTH": 24
      },
      {
        "SCI": 43,
        "MTH": 36
      },
      {
        "SCI": 47,
        "MTH": 28
      }
    ],
    "ROLL": "002",
    "NAME": "Bee",
    "CLASS": "XI",
    "CLASSTEACHER": "Ama",
    "HOME": "Kats"
  }
  ....
}

I can't get at the inner JSON. Here is what I have done so far -

import json
import sys

jsonLocation = sys.argv[1]
with open(jsonLocation, 'rb') as jsonFile:
    jsonData = json.load(jsonFile)

for rollNo in jsonData:
    print(rollNo)
    studentItems = jsonData[rollNo]  # the dict for this student
    print(studentItems['ROLL'])
    print(studentItems['NAME'])
    print(studentItems['CLASS'])
    print(studentItems['CLASSTEACHER'])
    print(studentItems['HOME'])
    print(studentItems['STUDENTTYPE']['TYPE'])

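I do get the value for each key in studentItems, but this seems like a clumsy approach to me. I also tried json.dump, but it failed with an error saying the JSON is not serializable. Is there a better way to traverse this JSON format?

Here is the sample output I am looking for -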

001:

001
Ben
XI
Aka
Katrasgarh

Boarder

42,22
49,36
42,26

002:

002
Bee
XI
Ama
Kats
..
.
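
On the json.dump error mentioned above: the failing call is not shown, so the cause can only be guessed. One common cause, assumed here, is passing the open file object instead of the parsed data. The sketch below uses an in-memory `io.StringIO` with made-up content as a stand-in for the real file, and shows that the dict returned by json.load serializes fine while the file object does not.

```python
import io
import json

# Stand-in for the opened file in the question (hypothetical content).
json_file = io.StringIO(
    '{"001": {"NAME": "Ben", "ACADEMICS": [{"SCI": 42, "MTH": 22}]}}'
)
json_data = json.load(json_file)

# Serializing the parsed dict works and is handy for inspecting the nesting.
pretty = json.dumps(json_data, indent=2)
print(pretty)

# Serializing the file object itself raises a "not JSON serializable" TypeError.
try:
    json.dumps(json_file)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

Printing `json.dumps(jsonData, indent=2)` is also a quick way to see which levels are dicts and which are lists before writing the loops.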

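【Comments】:

  • Which keys do you want in the output? Please add that to your question.
  • The keys inside studentItems.
  • Sorry, could you update your question with sample output?
  • I have updated the output. I don't really care about the output so much as the correct way to access the elements.
  • The structure returned by json.load is a nested Python dictionary. To iterate over the keys, iterate over the dict itself. To iterate over only the values, iterate over dict.values(). To iterate over keys and values in pairs, iterate over dict.items(). The last one looks like for key, value in jsonData[rollNo].items():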
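
The dict.items() suggestion in the last comment can be sketched as follows. This is a minimal example, not the only way to do it: the `describe` helper and the shortened sample data are assumptions made for illustration, and the order of the printed fields simply follows the sample output in the question.

```python
import json

# A shortened copy of the response from the question.
raw = """
{
  "001": {
    "STUDENTTYPE": {"TYPE": "Boarder"},
    "ACADEMICS": [{"SCI": 42, "MTH": 22}, {"SCI": 49, "MTH": 36}],
    "ROLL": "001", "NAME": "Ben", "CLASS": "XI",
    "CLASSTEACHER": "Aka", "HOME": "Katrasgarh"
  }
}
"""
json_data = json.loads(raw)


def describe(data):
    """Return the report lines for every student in the mapping."""
    lines = []
    for roll_no, student in data.items():  # key and value together
        lines.append(f"{roll_no}:")
        for key in ("ROLL", "NAME", "CLASS", "CLASSTEACHER", "HOME"):
            lines.append(student[key])  # plain string values
        lines.append(student["STUDENTTYPE"]["TYPE"])  # nested dict
        for exam in student["ACADEMICS"]:  # list of dicts
            lines.append(f"{exam['SCI']},{exam['MTH']}")
    return lines


print("\n".join(describe(json_data)))
```

Each level is accessed according to its type: the top level and STUDENTTYPE are dicts, so they are indexed by key, while ACADEMICS is a list, so it is looped over.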

Tags: python json python-3.x parsing


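【Solution 1】:

It's a bit unclear how you want the output to look, but I went ahead and flattened the nested JSON, then rebuilt it as a DataFrame. From there you can access the data by slicing or filtering the table, write it to CSV, or do whatever else you want. Essentially, each row represents one ROLL with its attributes and the corresponding science and math scores, whose index numbers start at 0. If some students have longer lists under the ACADEMICS key, the rows for students with fewer test scores will contain nulls in the extra columns.

Given: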

jsonData = {
  "001": {
    "STUDENTTYPE": {
      "TYPE": "Boarder"
    },
    "ACADEMICS": [
      {
        "SCI": 42,
        "MTH": 22
      },
      {
        "SCI": 49,
        "MTH": 36
      },
      {
        "SCI": 42,
        "MTH": 26
      }
    ],
    "ROLL": "001",
    "NAME": "Ben",
    "CLASS": "XI",
    "CLASSTEACHER": "Aka",
    "HOME": "Katrasgarh"
  },
  "002": {
    "STUDENTTYPE": {
      "TYPE": "DayScholar"
    },
    "ACADEMICS": [
      {
        "SCI": 43,
        "MTH": 24
      },
      {
        "SCI": 43,
        "MTH": 36
      },
      {
        "SCI": 47,
        "MTH": 28
      }
    ],
    "ROLL": "002",
    "NAME": "Bee",
    "CLASS": "XI",
    "CLASSTEACHER": "Ama",
    "HOME": "Kats"
  }

}

Code:

import re

import pandas as pd


def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for a in x:
                flatten(x[a], name + a + '_')
        elif isinstance(x, list):
            for i, a in enumerate(x):
                flatten(a, name + str(i) + '_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated key.
            out[name[:-1]] = x

    flatten(y)
    return out


flat = flatten_json(jsonData)

# Rebuild a table: the leading roll number becomes the row index,
# the rest of the flattened key becomes the column name.
results = pd.DataFrame()
for item, value in flat.items():
    row_idx = re.findall(r'(\d+)_', item)[0]
    column = item.replace(row_idx + '_', '', 1)
    results.loc[int(row_idx), column] = value

Output:

print (results.to_string())
  STUDENTTYPE_TYPE  ACADEMICS_0_SCI  ACADEMICS_0_MTH  ACADEMICS_1_SCI  ACADEMICS_1_MTH  ACADEMICS_2_SCI  ACADEMICS_2_MTH ROLL NAME CLASS CLASSTEACHER        HOME
1          Boarder             42.0             22.0             49.0             36.0             42.0             26.0  001  Ben    XI          Aka  Katrasgarh
2       DayScholar             43.0             24.0             43.0             36.0             47.0             28.0  002  Bee    XI          Ama        Kats
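
To illustrate the remark about students with ACADEMICS lists of different lengths, here is a small pure-Python sketch that leaves pandas out. It reuses the same flattening idea as the answer. The two-student `sample` is made up for the example, and the grouping step with `split` stands in for the DataFrame rebuild. It lists which flattened columns each student lacks, which are the cells that would come out empty in the table.

```python
def flatten_json(y):
    """Same flattening idea as in the answer: join nested keys with '_'."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + "_")
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, name + str(i) + "_")
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


# Hypothetical input: student 002 sat one exam more than student 001.
sample = {
    "001": {"NAME": "Ben", "ACADEMICS": [{"SCI": 42, "MTH": 22}]},
    "002": {"NAME": "Bee", "ACADEMICS": [{"SCI": 43, "MTH": 24},
                                         {"SCI": 43, "MTH": 36}]},
}

flat = flatten_json(sample)

# Group the flat keys back into one dict per roll number.
rows = {}
for key, value in flat.items():
    roll, column = key.split("_", 1)
    rows.setdefault(roll, {})[column] = value

# Columns that exist for some student but not for this one would be empty.
all_columns = {c for row in rows.values() for c in row}
missing = {roll: sorted(all_columns - set(row)) for roll, row in rows.items()}
print(missing)
```

Here student 001 has no ACADEMICS_1_SCI or ACADEMICS_1_MTH entry, so those two cells would be null in that student's row.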

【Comments】:
