【Question Title】: Pandas not getting data from JSON API properly
【Posted】: 2020-09-23 18:46:40
【Question】:

I am trying to load data from a JSON API into a Pandas DataFrame, but Pandas does not read the data correctly. Here are my code and its output:

import pandas as pd
import requests
r = requests.get('https://api.covid19india.org/raw_data5.json')
j = r.json()
df = pd.DataFrame.from_dict(j)

However, the output I get is incorrect: every record ends up as a dict in a single raw_data column.

raw_data
0   {'agebracket': '', 'contractedfromwhichpatient...
1   {'agebracket': '', 'contractedfromwhichpatient...
2   {'agebracket': '', 'contractedfromwhichpatient...
3   {'agebracket': '', 'contractedfromwhichpatient...
4   {'agebracket': '', 'contractedfromwhichpatient...

When I run df.info(), I get:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 1 columns):
raw_data    20409 non-null object
dtypes: object(1)
memory usage: 159.5+ KB

Can anyone help me solve this?

【Comments】:

  • Use j = r.json()['raw_data']

Tags: python python-3.x pandas python-2.7 data-science


【Solution 1】:

Try expanding the dicts in the raw_data column into separate columns:

df = df['raw_data'].apply(pd.Series)
df.info()

Output:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 20409 entries, 0 to 20408
    Data columns (total 20 columns):
    agebracket                             20409 non-null object
    contractedfromwhichpatientsuspected    20409 non-null object
    currentstatus                          20409 non-null object
    dateannounced                          20409 non-null object
    detectedcity                           20409 non-null object
    detecteddistrict                       20409 non-null object
    detectedstate                          20409 non-null object
    entryid                                20409 non-null object
    gender                                 20409 non-null object
    nationality                            20409 non-null object
    notes                                  20409 non-null object
    numcases                               20409 non-null object
    patientnumber                          20409 non-null object
    source1                                20409 non-null object
    source2                                20409 non-null object
    source3                                20409 non-null object
    statecode                              20409 non-null object
    statepatientnumber                     20409 non-null object
    statuschangedate                       20409 non-null object
    typeoftransmission                     20409 non-null object
    dtypes: object(20)
    memory usage: 3.1+ MB
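
As a minimal offline sketch of what this step does, the snippet below uses a small hypothetical payload shaped like the API response (the keys come from the output above; the values are made up for illustration). It also shows a common alternative, building the frame from the list of dicts directly, which avoids creating one Series per row.

```python
import pandas as pd

# Hypothetical sample shaped like the API response:
# one top-level key ("raw_data") holding a list of record dicts.
sample = {
    "raw_data": [
        {"agebracket": "", "detectedstate": "Kerala", "numcases": "1"},
        {"agebracket": "45", "detectedstate": "Delhi", "numcases": "2"},
    ]
}

# Reproduces the problem: a single 'raw_data' column whose cells are dicts.
df = pd.DataFrame.from_dict(sample)

# Solution 1: expand each dict into its own row of columns.
expanded = df["raw_data"].apply(pd.Series)

# Alternative with the same result: build the frame from the list of dicts.
from_list = pd.DataFrame(df["raw_data"].tolist())

print(expanded.columns.tolist())
```

On a frame with roughly 20,000 rows, the list-based version is likely to be noticeably faster than apply(pd.Series), though that is an expectation rather than something measured here.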

【Comments】:

    【Solution 2】:

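    Select the raw_data key from the JSON so that j is the list of records, then build the DataFrame from that list:

    j = r.json()['raw_data']
    df = pd.DataFrame(j)
    df.info()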
    

    Output:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 20409 entries, 0 to 20408
    Data columns (total 20 columns):
     #   Column                               Non-Null Count  Dtype 
    ---  ------                               --------------  ----- 
     0   agebracket                           20409 non-null  object
     1   contractedfromwhichpatientsuspected  20409 non-null  object
     2   currentstatus                        20409 non-null  object
     3   dateannounced                        20409 non-null  object
     4   detectedcity                         20409 non-null  object
     5   detecteddistrict                     20409 non-null  object
     6   detectedstate                        20409 non-null  object
     7   entryid                              20409 non-null  object
     8   gender                               20409 non-null  object
     9   nationality                          20409 non-null  object
     10  notes                                20409 non-null  object
     11  numcases                             20409 non-null  object
     12  patientnumber                        20409 non-null  object
     13  source1                              20409 non-null  object
     14  source2                              20409 non-null  object
     15  source3                              20409 non-null  object
     16  statecode                            20409 non-null  object
     17  statepatientnumber                   20409 non-null  object
     18  statuschangedate                     20409 non-null  object
     19  typeoftransmission                   20409 non-null  object
    dtypes: object(20)
    memory usage: 3.1+ MB
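
The approach above can be sketched end to end without a network call. The helper name records_to_frame and the sample values are hypothetical, introduced only for illustration; with the live API, payload would be the result of r.json().

```python
import pandas as pd


def records_to_frame(payload, key="raw_data"):
    """Hypothetical helper: build a DataFrame from the list of records under `key`."""
    records = payload[key]  # list of dicts, one per row
    return pd.DataFrame(records)


# Stand-in for r.json(); values are made up, keys follow the output above.
payload = {
    "raw_data": [
        {"agebracket": "", "detectedstate": "Kerala", "numcases": "1"},
        {"agebracket": "45", "detectedstate": "Delhi", "numcases": "2"},
    ]
}

df = records_to_frame(payload)
df.info()
```

Selecting the key before constructing the frame means the dicts never land in a single object column, so no second expansion step is needed.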
    

    【Comments】:
