【Question title】: Parse JSON data into Python
【Posted】: 2013-12-09 22:51:58
【Question description】:

I have a JSON file that contains some data I need. I want to write a Python program that reads it and pulls out the information. Any help?

Here is a sample of the JSON:

var case_data = {
  "cases": {
    "1": {
      "amount": 1500.0, 
      "case_id": "1", 
      "case_name": "US v. Control Systems Specialist, Inc. and Darrold Richard Crites", 
      "country": "br", 
      "sector": "sector-defense"
    }, 
    "10": {
      "amount": 0.0, 
      "case_id": "10", 
      "case_name": "SEC v. Int'l Systems & Controls Corp.", 
      "country": "cl", 
      "sector": "sector-agriculture"
    }
  }, 
  "countries": {
    "ae": {
      "cases": [
        "191", 
        "192", 
        "193", 
        "282", 
        "332"
      ], 
      "sectors": {
        "sector-consulting": {
          "total": 1812113.33
        }, 
        "sector-energy": {
          "total": 6622147.0
        }, 
        "sector-infrastructure": {
          "total": 4694551.0
        }
      }, 
      "total": 13128811.33, 
      "tree": {
        "children": [
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 4550000.0, 
              "$area": 3
            }, 
            "id": "191", 
            "name": "Control Components Inc. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-infrastructure", 
            "data": {
              "$amount": 140551.0, 
              "$area": 2
            }, 
            "id": "192", 
            "name": "Textron Inc."
          }, 
          {
            "children": [], 
            "classname": "sector-infrastructure", 
            "data": {
              "$amount": 4554000.0, 
              "$area": 3
            }, 
            "id": "193", 
            "name": "York International Corp."
          }, 
          {
            "children": [], 
            "classname": "sector-consulting", 
            "data": {
              "$amount": 1812113.33, 
              "$area": 3
            }, 
            "id": "282", 
            "name": "Aon Corporation"
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 2072147.0, 
              "$area": 3
            }, 
            "id": "332", 
            "name": "Tyco Int\u2019l Ltd. et al."
          }
        ], 
        "data": {
          "$amount": 0, 
          "$area": 14
        }, 
        "id": "ae", 
        "name": "UAE"
      }
    }, 
    "ao": {
      "cases": [
        "5", 
        "9", 
        "207", 
        "208", 
        "209"
      ], 
      "sectors": {
        "sector-consulting": {
          "total": 12350000.0
        }, 
        "sector-energy": {
          "total": 18097043.0
        }, 
        "sector-telecom": {
          "total": 7080000.0
        }
      }, 
      "total": 37527043.0, 
      "tree": {
        "children": [
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 302043.0, 
              "$area": 2
            }, 
            "id": "5", 
            "name": "ABB Ltd. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 16335000.0, 
              "$area": 6
            }, 
            "id": "9", 
            "name": "Baker Hughes Inc. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 1460000.0, 
              "$area": 3
            }, 
            "id": "207", 
            "name": "GlobalSanteFe Corp."
          }, 
          {
            "children": [], 
            "classname": "sector-telecom", 
            "data": {
              "$amount": 7080000.0, 
              "$area": 3
            }, 
            "id": "208", 
            "name": "Alcatel-Lucent S.A. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-consulting", 
            "data": {
              "$amount": 12350000.0, 
              "$area": 6
            }, 
            "id": "209", 
            "name": "Panalpina World Transport (Holding) Ltd. et al."
          }
        ], 
        "data": {
          "$amount": 0, 
          "$area": 20
        }, 
        "id": "ao", 
        "name": "Angola"
      }
    }
  }
};

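I want to extract the number that follows "sector-energy" for each country. Note that this sample file contains two countries, "ae" and "ao".

【Question comments】:

Tags: python json web-scraping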


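【Solution 1】:

If you really had JSON, all you would need to do is use the json module to decode it into a native dict, just as you would use the JSON object in JavaScript to decode it into a native object. Compare this JS: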

var case_data = JSON.parse(data);

… to the equivalent Python:

case_data = json.loads(data)

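Once you have done that, you can stop thinking about JSON; the result is just ordinary native objects, which you access like any other combination of dicts, lists, strings, and numbers. For example: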

# Iterate over the country dicts (.values()), not the country-code keys
sector_energy = [country["sectors"]["sector-energy"]["total"]
                 for country in case_data["countries"].values()]
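
As a minimal sketch of the lookup described above, the snippet below decodes a trimmed-down copy of the question's data and collects the "sector-energy" total for each country. The `data` string, the made-up country code "xx", and the helper name `energy_totals` are illustrative assumptions, not part of the original answer. It uses `.get()` on the assumption that some countries may have no energy sector at all.

```python
import json

# Trimmed-down stand-in for the question's data (assumed to be real JSON here).
# "xx" is a made-up country with no energy sector, included to show the skip.
data = """
{
  "countries": {
    "ae": {"sectors": {"sector-energy": {"total": 6622147.0},
                       "sector-consulting": {"total": 1812113.33}}},
    "ao": {"sectors": {"sector-energy": {"total": 18097043.0}}},
    "xx": {"sectors": {"sector-telecom": {"total": 1.0}}}
  }
}
"""


def energy_totals(case_data):
    """Map each country code to its sector-energy total.

    Countries without a "sector-energy" entry are skipped.
    """
    totals = {}
    for code, country in case_data["countries"].items():
        sector = country.get("sectors", {}).get("sector-energy")
        if sector is not None:
            totals[code] = sector["total"]
    return totals


case_data = json.loads(data)
totals = energy_totals(case_data)
print(totals)
```

Returning a dict keyed by country code, rather than a bare list, keeps each number attached to the country it belongs to.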

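However, what you have shown us is not JSON at all; it is JavaScript source code that assigns a complex object to a variable. You cannot parse it as JSON in any language, because it is not JSON.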
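
To see the failure concretely, here is a small sketch that feeds a shortened stand-in for such a file (the `js_source` string is made up for illustration) to `json.loads`. The decoder rejects it at the very first character, because `var` is not a JSON value.

```python
import json

# Shortened, hypothetical stand-in for the file in the question.
js_source = 'var case_data = {"cases": {"1": {"amount": 1500.0}}};'

try:
    json.loads(js_source)
    parsed_ok = True
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    parsed_ok = False
    print("not JSON:", exc)
```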

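Of course, the part of the source between the = and the ; is not only valid JavaScript but also valid JSON. For that matter, it is also valid Python and valid Ruby. But if you want to parse this file and others like it, you need to come up with rules for deciding which fragments represent the JSON you want to parse. Is every file just a single JS variable assignment? Or is it something different?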
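
If every file really is a single `var name = {...};` assignment, one possible rule is to take everything from the first `{` to the last `}` and hand that span to `json.loads`. The sketch below assumes exactly that layout. The function name `extract_json_object` is hypothetical, and the rule would break on files that hold more than one statement or that use JavaScript-only syntax such as comments or trailing commas.

```python
import json


def extract_json_object(js_source):
    """Return the decoded object literal from a 'var x = {...};' style source.

    Assumption: the file holds a single assignment whose right-hand side is
    valid JSON, so the span from the first '{' to the last '}' is the payload.
    """
    start = js_source.find("{")
    end = js_source.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no object literal found")
    return json.loads(js_source[start:end + 1])


# Hypothetical, shortened input in the same shape as the question's file.
js_source = '''var case_data = {
  "countries": {
    "ae": {"sectors": {"sector-energy": {"total": 6622147.0}}},
    "ao": {"sectors": {"sector-energy": {"total": 18097043.0}}}
  }
};'''

case_data = extract_json_object(js_source)
print(sorted(case_data["countries"]))
```

Slicing on the outermost braces avoids hard-coding the variable name, which may differ from file to file.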

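In any case, actually using JSON for interchange between languages is almost always much better than using something that merely resembles JSON and hoping it works.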

【Comments】:

    【Solution 2】:

    Python already provides a library for this:

    http://docs.python.org/3.3/library/json.html
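
For a file on disk, `json.load` reads and decodes in one step. This is a minimal sketch, assuming the file contains real JSON. The file name `case_data.json` and its contents are made up for the example and written to a temporary directory so the snippet is self-contained.

```python
import json
import os
import tempfile

# Create a small JSON file to read back (stand-in for the real data file).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "case_data.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"countries": {"ae": {"total": 13128811.33}}}, f)

# json.load takes an open file object; json.loads takes a string.
with open(path, encoding="utf-8") as f:
    case_data = json.load(f)

print(case_data["countries"]["ae"]["total"])
```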

    【Comments】:

    • This answer would be more useful if it included an example rather than just a link.