【Question title】: Open a JSON file with pandas DataFrame
【Posted】: 2020-12-22 07:02:42
【Question description】:

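Sorry for the trivial question:

I have a JSON file, first.json, and I want to open it with pandas.read_json:

df = pandas.read_json('first.json') gives me the following result:

The result I need is a single row with the keys ('name', 'street', 'geo', 'servesCuisine', etc.) as columns. I tried different values of the "orient" parameter, but that did not help. How can I get the DataFrame into the desired format?

Here is the data in my JSON file: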

{
    "name": "La Continental (San Telmo)",
    "geo": {
        "longitude": "-58.371852",
        "latitude": "-34.616099"
    },
    "servesCuisine": "Italian",
    "containedInPlace": {},
    "priceRange": 450,
    "currenciesAccepted": "ARS",
    "address": {
        "street": "Defensa 701",
        "postalCode": "C1065AAM",
        "locality": "Autonomous City of Buenos Aires",
        "country": "Argentina"
    },
    "aggregateRatings": {
        "thefork": {
            "ratingValue": 9.3,
            "reviewCount": 3
        },
        "tripadvisor": {
            "ratingValue": 4,
            "reviewCount": 350
        }
    },
    "id": "585777"
}

【Comments】:

    Tags: python json pandas dictionary


    【Solution 1】:

    You can try:

    import json
    import pandas as pd

    with open("test.json") as fp:
        s = json.load(fp)

    # flattened df, where nested keys -> column as `key1.key2.key_last`
    df = pd.json_normalize(s)

    # rename cols to innermost key only; keys repeated under different parents
    # (e.g. `ratingValue`) end up as duplicate column names
    cols = {col: col.split(".")[-1] for col in df.columns}
    df = df.rename(columns=cols)
    

    Output:

                             name servesCuisine  priceRange currenciesAccepted      id  ...    country ratingValue reviewCount ratingValue reviewCount
    0  La Continental (San Telmo)       Italian         450                ARS  585777  ...  Argentina         9.3           3           4         350
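
Renaming every column to its innermost key produces duplicate names here, since `ratingValue` and `reviewCount` appear under both `thefork` and `tripadvisor`. A minimal sketch of one way around that, assuming it is acceptable to keep the full key path in each column name, is to pass `sep` to `pd.json_normalize`. The inline `record` dict is a trimmed copy of the question's data, used here instead of reading the file.

```python
import pandas as pd

# Trimmed inline copy of the record from the question (assumption: same shape as the file)
record = {
    "name": "La Continental (San Telmo)",
    "geo": {"longitude": "-58.371852", "latitude": "-34.616099"},
    "aggregateRatings": {
        "thefork": {"ratingValue": 9.3, "reviewCount": 3},
        "tripadvisor": {"ratingValue": 4, "reviewCount": 350},
    },
    "id": "585777",
}

# One row; nested keys become columns joined with "_" instead of "."
df = pd.json_normalize(record, sep="_")

# Stripping the parent keys, as in the answer above, makes the names collide
renamed = df.rename(columns=lambda col: col.split("_")[-1])

print(df.columns.is_unique)       # full paths stay unique
print(renamed.columns.is_unique)  # innermost keys do not
```

With unique names, a single value can be read unambiguously, for example `df.loc[0, "aggregateRatings_tripadvisor_ratingValue"]`.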
    

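    【Comments】:

      【Solution 2】:

      You can read the JSON file with plain Python, convert it to a dict object, and then manually pick the data items from which to build a new DataFrame.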

      import json
      import pandas as pd
      
      # open/read the json data file and parse it into a dict
      # (json.load replaces eval, which would execute arbitrary code from the file)
      with open("test11.json", "r") as fo:
          inp_json = json.load(fo)
      
      # Or 
      # inp_json = your_json_data
      
      # prepare 1 row of data
      axis1 = [[inp_json["name"], inp_json["address"]["street"], inp_json["geo"], inp_json["servesCuisine"],
                inp_json["aggregateRatings"]["tripadvisor"]["ratingValue"],
                inp_json["id"],
               ], ] #for data
      axis0 = ['row_1', ]  #for index
      heads = ["name", "add_.street", "geo", "servesCuisine",
              "agg_.tripadv_.ratingValue", "id", ]
      
      # create a dataframe using the prepped values above
      df0 = pd.DataFrame(axis1, index=axis0, columns=heads)
      
      
      # see data in selected columns only
      df0[["name","add_.street","id"]]
      
                                   name  add_.street      id
      row_1  La Continental (San Telmo)  Defensa 701  585777
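
As a variation on picking items by hand, the record can be flattened first and the wanted columns selected by their dotted path. This is only a sketch: the `record` dict is a trimmed inline copy of the question's data, and the short column names in `wanted` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Trimmed inline copy of the record from the question (assumption: same shape as the file)
record = {
    "name": "La Continental (San Telmo)",
    "servesCuisine": "Italian",
    "address": {"street": "Defensa 701", "country": "Argentina"},
    "aggregateRatings": {
        "thefork": {"ratingValue": 9.3, "reviewCount": 3},
        "tripadvisor": {"ratingValue": 4, "reviewCount": 350},
    },
    "id": "585777",
}

# Flatten to one row with dotted column names such as "address.street"
flat = pd.json_normalize(record)

# Map each dotted path to keep onto a shorter, hypothetical column name
wanted = {
    "name": "name",
    "address.street": "street",
    "aggregateRatings.tripadvisor.ratingValue": "tripadvisor_rating",
    "id": "id",
}
df = flat[list(wanted)].rename(columns=wanted)
print(df)
```

Selecting by path keeps the choice of columns in one dict, so adding or dropping a field does not require touching the row-building code.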
      

      【Comments】:
