【Question title】: Import JSON file with dicts of dicts of nested dicts into Pandas
【Posted】: 2021-09-25 02:12:52
【Question description】:

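I am downloading a json file from the RKI in Germany (the equivalent of the CDC). It seems to have dictionaries inside dictionaries inside dictionaries. I am really only interested in the data dictionaries nested in the "features" dictionary. My problem is that every entry in it is nested under the same key, "attributes". This is what the text looks like (I have to work from the text because, due to proxy issues, I can't download it directly into Python. Grrr.).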

{"objectIdFieldName":"ObjectId","uniqueIdField":
{"name":"ObjectId","isSystemMaintained":true},
"globalIdFieldName":"","fields":
    [{"name":"AdmUnitId","type":"esriFieldTypeInteger","alias":"AdmUnitId","sqlType":"sqlTypeInteger","domain":null,"defaultValue":null},
...etc...    {"name":"ObjectId","type":"esriFieldTypeOID","alias":"ObjectId","sqlType":"sqlTypeInteger","domain":null,"defaultValue":null}],
"features":
    [{"attributes":{"AdmUnitId":0,"BundeslandId":0,"AnzFall":3741781,"AnzTodesfall":91337,"AnzFallNeu":1456,"AnzTodesfallNeu":18,"AnzFall7T":7178,"AnzGenesen":3638200,"AnzGenesenNeu":700,"AnzAktiv":12300,"AnzAktivNeu":700,"Inz7T":8.6,"ObjectId":1}},
    {"attributes":{"AdmUnitId":1,"BundeslandId":1,"AnzFall":64221,"AnzTodesfall":1628,"AnzFallNeu":35,"AnzTodesfallNeu":1,"AnzFall7T":181,"AnzGenesen":62300,"AnzGenesenNeu":0,"AnzAktiv":300,"AnzAktivNeu":0,"Inz7T":6.2,"ObjectId":2}},
    {"attributes":{"AdmUnitId":2,"BundeslandId":2,"AnzFall":77823,"AnzTodesfall":1603,"AnzFallNeu":50,"AnzTodesfallNeu":0,"AnzFall7T":217,"AnzGenesen":75700,"AnzGenesenNeu":0,"AnzAktiv":500,"AnzAktivNeu":0,"Inz7T":11.7,"ObjectId":3}},
    ...etc

When I try pd.read_json(the_file), I get a ValueError saying that the arrays must all be the same length.

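If I open the file and load it with json, creating a dictionary, I get my dictionary and the dictionaries I want. I can almost get there, as shown below, but I end up with a list of nested dictionaries in which the key is always "attributes", and indexing it by that key throws an error.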

import json

with open(r"Q:\AbisF\Covid-19\Lageberichte\Misc\RKI_7Tages.json") as json_data:
    data = json.load(json_data)
# dig down to the data
features = data["features"]
attributes = features["attributes"]   # TypeError: list indices must be integers or slices, not str

I am wondering whether I am on the wrong track, and whether there is a way to clean up my list (get rid of the "attributes" level).

【Comments】:

    Tags: python json pandas list dictionary


    【Solution 1】:

    I think your features = data["features"] is now a list of dicts, which is why indexing it with a string key fails.

    You can iterate over those:

    features = data["features"]
    for feature in features:
        attributes = feature["attributes"]
        print(attributes['AdmUnitId'])  # example item in attributes
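
If the end goal is a DataFrame, the same loop can be collapsed into a list comprehension and handed to pandas in one step. This is a minimal sketch, assuming `data` has the shape shown in the question; the small inline `data` dict below is a shortened stand-in for the real file, with values copied from the question's excerpt, and `rows` and `df` are illustrative names.

```python
import pandas as pd

# Shortened stand-in for the dict returned by json.load() on the RKI file
data = {
    "features": [
        {"attributes": {"AdmUnitId": 0, "AnzFall": 3741781, "Inz7T": 8.6}},
        {"attributes": {"AdmUnitId": 1, "AnzFall": 64221, "Inz7T": 6.2}},
        {"attributes": {"AdmUnitId": 2, "AnzFall": 77823, "Inz7T": 11.7}},
    ]
}

# Strip the "attributes" level: one flat dict per feature
rows = [feature["attributes"] for feature in data["features"]]

# A list of flat dicts becomes one row per dict, one column per key
df = pd.DataFrame(rows)
print(df)
```

Because each dict already corresponds to one row, no transposing should be needed with this approach.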
    

    【Discussion】:

    • OK, super. Then I just need to transpose the result and build my dataframe one piece at a time. Thanks a lot.
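
As a possible alternative to building the frame piece by piece, `pandas.json_normalize` can flatten the nested level directly. The sketch below is not from the thread; it assumes the same structure as in the question, and `features` is a shortened stand-in for `data["features"]`. By default the nested keys come out as dotted column names, so the prefix is stripped afterwards.

```python
import pandas as pd

# Shortened stand-in for data["features"] from the question
features = [
    {"attributes": {"AdmUnitId": 0, "AnzFall": 3741781, "Inz7T": 8.6}},
    {"attributes": {"AdmUnitId": 1, "AnzFall": 64221, "Inz7T": 6.2}},
]

# Flatten nested dicts; columns come out as "attributes.AdmUnitId", etc.
df = pd.json_normalize(features)

# Drop the "attributes." prefix (plain string replace, not a regex)
df.columns = df.columns.str.replace("attributes.", "", regex=False)
print(df)
```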