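【Question title】: How to properly read a JSON file using pandas
【Posted】: 2020-04-08 19:35:23
【Question description】:

I am currently developing an application to analyze Stack Overflow questions. I fetch the data from the Stack API as a JSON file and read the file with the following code.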

import pandas as pd
import json
df = pd.read_json("questions_sof.json")
df.head(3)

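However, the output only shows each record as raw, JSON-like data in a single column. What I actually want is to put the records into a table so I can analyze the data manually, since a table is easier to work with visually.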

questions
1   {'tags': ['r','loops','linear-regression'], 'owner': {'re...
2   {'tags': ['vb.net', 'winforms'], 'owner': {'re...

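I have tried some code but could not find a way to display the data properly in a table. Could you suggest a proper way to show this data in a table, or point me to some links so I can work out the answer myself? The JSON file contains questions extracted from Stack Overflow for analysis; sample data from the file is given below.

Contents of the JSON file: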

 {"questions":[
   {
     "tags": [
       "r",
       "loops",
       "linear-regression"
     ],
     "owner": {
       "reputation": 23,
       "user_id": 13106013,
       "user_type": "registered",
       "profile_image":"https://www.gravatar.com/avatar/7cfd118a3deb280317d603fe02271ed9?s=128&d=identicon&r=PG",
       "display_name": "Pablo",
       "link": "https://stackoverflow.com/users/13106013/pablo"
     },
     "is_answered": false,
     "view_count": 1,
     "answer_count": 0,
     "score": 0,
     "last_activity_date": 1586211687,
     "creation_date": 1586211687,
     "question_id": 61069878,
     "link": "https://stackoverflow.com/questions/61069878/loop-for-multiple-linear-regression",
     "title": "Loop for multiple linear regression"
   },
   {
     "tags": [
       "vb.net",
       "winforms"
     ],
      "owner": {
       "reputation": 1,
       "user_id": 13242730,
       "user_type": "registered",
       "profile_image": "https://graph.facebook.com/1499587313549122/picture?type=large",
       "display_name": "Ante Petrovi\u0107",
       "link": "https://stackoverflow.com/users/13242730/ante-petrovi%c4%87"
     },
     "is_answered": false,
     "view_count": 9,
     "answer_count": 0,
     "score": 0,
     "last_activity_date": 1586211684,
     "creation_date": 1586210993,
     "last_edit_date": 1586211684,
     "question_id": 61069743,
     "link": "https://stackoverflow.com/questions/61069743/how-to-make-a-program-load-buttons-before-resizing-them",
     "title": "How to make a program load buttons before resizing them?"
   }
  ]
 }

【Comments】:

  • Please show the contents of your JSON file
  • df = pd.read_json('questions_sof.json', orient='records')?
  • @Chris That doesn't work. The result is still the same.
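
A likely reason the result stays the same: the top-level value in the file is an object with a single key, "questions", so pd.read_json builds one column named questions whose cells are whole dicts. The sketch below reproduces that with a trimmed-down, made-up sample (not the real file) and then passes the inner list to pd.DataFrame, which gives one column per top-level field. The names text, raw and flat are only for illustration.

```python
import json
from io import StringIO

import pandas as pd

# Trimmed-down sample with the same shape as the file in the question.
text = """{"questions": [
  {"tags": ["r", "loops"], "owner": {"reputation": 23}, "score": 0,
   "title": "Loop for multiple linear regression"},
  {"tags": ["vb.net", "winforms"], "owner": {"reputation": 1}, "score": 0,
   "title": "How to make a program load buttons before resizing them?"}
]}"""

# Reading the whole document: the single top-level key becomes the only column.
raw = pd.read_json(StringIO(text))
print(raw.columns.tolist())

# Passing the list stored under "questions": one column per top-level field.
# Nested values such as "owner" are still dicts at this point.
flat = pd.DataFrame(json.loads(text)["questions"])
print(flat.columns.tolist())
```

Nested fields such as owner are not expanded by this step; that is what json_normalize in the answer below takes care of.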

Tags: json python-3.x pandas


【Solution 1】:

Is this the output you are looking for?

import json
import pandas as pd

f = """{"questions":[
   {
     "tags": [
       "r",
       "loops",
       "linear-regression"
     ],
     "owner": {
       "reputation": 23,
       "user_id": 13106013,
       "user_type": "registered",
       "profile_image":"https://www.gravatar.com/avatar/7cfd118a3deb280317d603fe02271ed9?s=128&d=identicon&r=PG",
       "display_name": "Pablo",
       "link": "https://stackoverflow.com/users/13106013/pablo"
     },
     "is_answered": false,
     "view_count": 1,
     "answer_count": 0,
     "score": 0,
     "last_activity_date": 1586211687,
     "creation_date": 1586211687,
     "question_id": 61069878,
     "link": "https://stackoverflow.com/questions/61069878/loop-for-multiple-linear-regression",
     "title": "Loop for multiple linear regression"
   },
   {
     "tags": [
       "vb.net",
       "winforms"
     ],
      "owner": {
       "reputation": 1,
       "user_id": 13242730,
       "user_type": "registered",
       "profile_image": "https://graph.facebook.com/1499587313549122/picture?type=large",
       "display_name": "Ante Petrovi\u0107",
       "link": "https://stackoverflow.com/users/13242730/ante-petrovi%c4%87"
     },
     "is_answered": false,
     "view_count": 9,
     "answer_count": 0,
     "score": 0,
     "last_activity_date": 1586211684,
     "creation_date": 1586210993,
     "last_edit_date": 1586211684,
     "question_id": 61069743,
     "link": "https://stackoverflow.com/questions/61069743/how-to-make-a-program-load-buttons-before-resizing-them",
     "title": "How to make a program load buttons before resizing them?"
   }
  ]
 }"""

# load json
j = json.loads(f)
# normalize json
df = pd.json_normalize(j['questions'])
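
The same approach can be applied to a file on disk instead of an inline string. The following is a minimal sketch, assuming pandas 1.0 or later (where pd.json_normalize is available at the top level). It writes a trimmed-down, made-up sample to a temporary file so that it runs without questions_sof.json; the file name questions_sample.json and the choice of columns are illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Trimmed-down sample with the same shape as the file in the question.
sample = {
    "questions": [
        {
            "tags": ["r", "loops", "linear-regression"],
            "owner": {"reputation": 23, "display_name": "Pablo"},
            "score": 0,
            "creation_date": 1586211687,
            "question_id": 61069878,
            "title": "Loop for multiple linear regression",
        },
        {
            "tags": ["vb.net", "winforms"],
            "owner": {"reputation": 1, "display_name": "Ante Petrovi\u0107"},
            "score": 0,
            "creation_date": 1586210993,
            "question_id": 61069743,
            "title": "How to make a program load buttons before resizing them?",
        },
    ]
}

# Write the sample to a temporary file (hypothetical name) and read it back.
path = Path(tempfile.mkdtemp()) / "questions_sample.json"
path.write_text(json.dumps(sample), encoding="utf-8")

with path.open(encoding="utf-8") as fh:
    data = json.load(fh)

# Flatten the list under "questions"; nested keys become dotted column names.
df = pd.json_normalize(data["questions"])

# Optional tidy-up for reading the table by eye.
df["creation_date"] = pd.to_datetime(df["creation_date"], unit="s")
df["tags"] = df["tags"].str.join(", ")

print(df[["question_id", "title", "tags", "owner.display_name", "owner.reputation"]])
```

Fields nested under owner end up in columns such as owner.reputation and owner.display_name, while tags remains a single column, joined into one string here purely for readability.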

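【Comments】:

  • No. I want to show the JSON data in a table so it can be visualized clearly.
  • What do you mean by a table? I created a DataFrame from the JSON... scroll down
  • pandas has no json_normalize() method. But I found the answer by importing the relevant library into the code. Thanks a lot.
  • @Anupa_sj I'm not sure which version of pandas you are using, but pandas.json_normalize exists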
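
The disagreement in the last two comments is probably a version difference: pandas.json_normalize was added at the top level in pandas 1.0, while earlier releases only provide the function as pandas.io.json.json_normalize. A small sketch of an import that works either way (the records list is a made-up example):

```python
import pandas as pd

try:
    # pandas 1.0 and later expose the function at the top level.
    json_normalize = pd.json_normalize
except AttributeError:
    # Older releases only provide it under pandas.io.json.
    from pandas.io.json import json_normalize

print(pd.__version__)

# Made-up record with one nested field, just to show the call.
records = [{"owner": {"reputation": 23}, "score": 0}]
flat = json_normalize(records)
print(flat.columns.tolist())
```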