【Question title】: How to convert a nested JSON response from an API into a DataFrame
【Posted】: 2021-03-13 09:22:40
【Question】:
{
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "\tsangat terbantu????",
                        "lastModified": {
                            "seconds": "1606819245",
                            "nanos": 835000000
                        },
                        "starRating": 5,
                        "reviewerLanguage": "id",
                        "device": "1601",
                        "androidOsVersion": 23,
                        "appVersionCode": 20365,
                        "appVersionName": "5.2.73",
                        "deviceMetadata": {
                            "productName": "1601 (1601)",
                            "manufacturer": "Vivo",
                            "deviceClass": "FORM_FACTOR_PHONE",
                            "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                            "cpuModel": "MT6750",
                            "cpuMake": "Mediatek"
                        }
                    }
                },
                {
                    "developerComment": {
                        "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                        "lastModified": {
                            "seconds": "1606818598",
                            "nanos": 722000000
                        }
                    }
                }
            ]
        }
    ],
    "tokenPagination": {
        "nextPageToken": "abc"
    }
}

I want the columns to be named reviewId, authorName, userComment_text, userComment_lastModified, starRating, deviceMetadata.manufacturer, developerComment.text

I tried this:

df=pd.json_normalize(fetch_reviews_response, record_path="reviews")

But it only creates the reviewId, authorName, and comments columns.

【Comments】:

    Tags: python list dataframe


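    【Solution 1】:

    Please try the repo and see whether it works for you.

    It uses a recursive function to flatten the JSON. The function in 'json_to_csv.py' can easily be ported for your use: once the JSON result is flat, load it into a DataFrame, for example with 'pandas.read_json'.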
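As a rough illustration of the recursive approach described above, here is a minimal sketch. It is not the code from 'json_to_csv.py'; the helper name `flatten`, the `.` separator, and the shortened sample data are assumptions made for this example. List positions become part of the column name.

```python
import pandas as pd

def flatten(obj, parent_key="", sep="."):
    """Recursively flatten nested dicts and lists into a single-level dict.

    Hypothetical helper for illustration; not taken from the linked repo.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else key
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        # List elements are keyed by their position in the list.
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        # Scalar leaf: store it under the accumulated key.
        items[parent_key] = obj
    return items

# Shortened version of the response from the question.
response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {"userComment": {"text": "sangat terbantu", "starRating": 5}},
                {"developerComment": {"text": "Terima kasih"}},
            ],
        }
    ]
}

# One flat dict per review, one row per review.
df = pd.DataFrame([flatten(review) for review in response["reviews"]])
print(df.columns.tolist())
```

The resulting column names carry the list index (for example `comments.0.userComment.text`), so they would still need renaming to match the names requested in the question.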

    【Comments】:

      【Solution 2】:

      First I restructured the JSON file as shown below:

      {
          "reviews": {
              "reviewId": "12a3",
              "authorName": "Muhammad Arifin",
              "comments": {
                  "userComment": {
                      "text": "\tsangat terbantu?",
                      "lastModified": {
                          "seconds": "1606819245",
                          "nanos": 835000000
                      },
                      "starRating": 5,
                      "reviewerLanguage": "id",
                      "device": "1601",
                      "androidOsVersion": 23,
                      "appVersionCode": 20365,
                      "appVersionName": "5.2.73",
                      "deviceMetadata": {
                          "productName": "1601 (1601)",
                          "manufacturer": "Vivo",
                          "deviceClass": "FORM_FACTOR_PHONE",
                          "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                          "cpuModel": "MT6750",
                          "cpuMake": "Mediatek"
                      }
                  },
                  "developerComment": {
                      "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                      "lastModified": {
                          "seconds": "1606818598",
                          "nanos": 722000000
                      }
                  }
              },
              "tokenPagination": {
                  "nextPageToken": "abc"
              }
          }
      }
      

      Then, in a Python file, I used some pandas features to build the DataFrame.

      import pandas as pd
      
      # Load the restructured file; the keys under "reviews" become the row index
      df = pd.read_json("data.json")
      # Each assignment broadcasts a single scalar value to every row
      df['reviewId'] = df['reviews']['reviewId']
      df['authorName'] = df['reviews']['authorName']
      df['userComment_text'] = df['reviews']['comments']['userComment']['text']
      df['userComment_lastModified'] = df['reviews']['comments']['userComment']['lastModified']['seconds']
      df['starRating'] = df['reviews']['comments']['userComment']['starRating']
      df['deviceMetadata.manufacturer'] = df['reviews']['comments']['userComment']['deviceMetadata']['manufacturer']
      df['developerComment.text'] = df['reviews']['comments']['developerComment']['text']
      
      print(df.head())
      

      Here is my output:

                                                                 reviews  ...                              developerComment.text
      authorName                                         Muhammad Arifin  ...  Terima kasih sudah berbagi, kami sangat senang...
      comments         {'userComment': {'text': ' sangat terbantu?', ...  ...  Terima kasih sudah berbagi, kami sangat senang...
      reviewId                                                      12a3  ...  Terima kasih sudah berbagi, kami sangat senang...
      tokenPagination                           {'nextPageToken': 'abc'}  ...  Terima kasih sudah berbagi, kami sangat senang...
      

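      You can also change the rows as needed. I did not edit them because you did not provide any information about the rows.

      Hope this helps.

      【Comments】:

      • The nested array part causes the problem: "reviews": [{}]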
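To address the nested-array concern raised in the comment above, one possible approach keeps the original response shape and avoids restructuring the file by hand. This is only a sketch under stated assumptions: the sample data is shortened, each entry in "comments" is assumed to be a one-key dict (userComment or developerComment), and each review is assumed to hold at most one of each (a second userComment would overwrite the first). The `wanted` mapping is a hypothetical helper for renaming columns to the names requested in the question.

```python
import pandas as pd

# Shortened version of the original API response (lists left intact).
fetch_reviews_response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {"userComment": {
                    "text": "\tsangat terbantu",
                    "lastModified": {"seconds": "1606819245", "nanos": 835000000},
                    "starRating": 5,
                    "deviceMetadata": {"manufacturer": "Vivo"},
                }},
                {"developerComment": {
                    "text": "Terima kasih sudah berbagi",
                    "lastModified": {"seconds": "1606818598", "nanos": 722000000},
                }},
            ],
        }
    ],
    "tokenPagination": {"nextPageToken": "abc"},
}

rows = []
for review in fetch_reviews_response["reviews"]:
    row = {"reviewId": review.get("reviewId"), "authorName": review.get("authorName")}
    # Merge the one-key comment dicts so each review becomes one nested dict.
    for comment in review.get("comments", []):
        row.update(comment)
    rows.append(row)

# json_normalize expands nested dicts into dotted column names.
flat = pd.json_normalize(rows)

# Hypothetical mapping from flattened names to the names asked for in the question.
wanted = {
    "reviewId": "reviewId",
    "authorName": "authorName",
    "userComment.text": "userComment_text",
    "userComment.lastModified.seconds": "userComment_lastModified",
    "userComment.starRating": "starRating",
    "userComment.deviceMetadata.manufacturer": "deviceMetadata.manufacturer",
    "developerComment.text": "developerComment.text",
}
# reindex keeps only the chosen columns and fills NaN where a field is absent.
df = flat.reindex(columns=list(wanted)).rename(columns=wanted)
print(df.T)
```

Using `reindex` rather than plain column selection means a review without a developer reply still produces a row, with NaN in `developerComment.text`.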