【Question title】: Converting a sub-OrderedDict to a DataFrame
【Posted】: 2018-05-22 07:17:51
【Question description】:

I am working in Jupyter and have pulled data from Airtable's API. It is now stored as multiple OrderedDicts, and I need to convert this data into separate DataFrames.

OrderedDict([('records',
                  [OrderedDict([('id', 'rec0O8L1dlrobrPtj'),
                                ('fields', OrderedDict()),
                                ('createdTime', '2018-05-18T05:36:54.000Z')]),
                   OrderedDict([('id', 'rec13WqEutT0SwIP0'),
                                ('fields',
                                 OrderedDict([('Lead ID', '64556'),
                                              ('Company Name',
                                               'CesKath (Ukay-Ukay) / KRKK Online Shop'),
                                              ('Client Name',
                                               'Kamille Rona Venturina Taytay'),
                                              ('Principal Defendant Name/s',
                                               'n/a'),
                                              ('Co-Defendant Name/s', 'n/a'),
                                              ('Plaintiff', 'n/a'),
                                              ('Nature of Case', 'n/a'),
                                              ('Trial Court', 'n/a'),
                                              ('City/Province', 'n/a'),
                                              ('Sala No.', 'n/a'),
                                              ('Case Number', 'n/a'),
                                              ('Case Status', 'n/a'),
                                              ('Address', 'n/a')])),

I tried the following code, which converts everything into a single DataFrame:

df = pd.DataFrame.from_dict(data)     

When I run this code, it produces the following:

     records                     offset
0   {'id': 'rec0O8L1dlrobrPtj', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
1   {'id': 'rec13WqEutT0SwIP0', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
2   {'id': 'rec22sGXgPU9hFbTq', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
3   {'id': 'rec2a4MQL24dQhGzI', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
4   {'id': 'rec3VBhG7u55BQsFy', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
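
For context, the shape of this result is consistent with how pandas treats a top-level mapping: each top-level key (here records and offset) becomes a column, each record dict is left unexpanded inside its cell, and the single offset string is repeated on every row. A minimal sketch, assuming a response of that shape; the ids and the offset string below are made-up placeholders, not the real Airtable values:

```python
from collections import OrderedDict
import pandas as pd

# Hypothetical stand-in for the API response (placeholder ids and offset).
sample = OrderedDict([
    ("records", [
        OrderedDict([("id", "recA"), ("fields", OrderedDict())]),
        OrderedDict([("id", "recB"),
                     ("fields", OrderedDict([("Lead ID", "64556")]))]),
    ]),
    ("offset", "placeholder-offset"),
])

wide = pd.DataFrame.from_dict(sample)

# One column per top-level key; each "records" cell still holds a whole dict.
print(list(wide.columns))
print(type(wide.loc[0, "records"]))
```

So the nested fields are not lost, they are just one level deeper than from_dict looks.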

I need to access the OrderedDict at the third level of indentation, i.e.:

                                              ('Lead ID', '64556'),
                                              ('Company Name',
                                               'CesKath (Ukay-Ukay) / KRKK Online Shop'),
                                              ('Client Name',
                                               'Kamille Rona Venturina Taytay'),
                                              ('Principal Defendant Name/s',
                                               'n/a'),
                                              ('Co-Defendant Name/s', 'n/a'),
                                              ('Plaintiff', 'n/a'),
                                              ('Nature of Case', 'n/a'),
                                              ('Trial Court', 'n/a'),
                                              ('City/Province', 'n/a'),
                                              ('Sala No.', 'n/a'),
                                              ('Case Number', 'n/a'),
                                              ('Case Status', 'n/a'),
                                              ('Address', 'n/a')])),

How can I access the sub-OrderedDict and convert it to a DataFrame?

【Question comments】:

    Tags: python dataframe ordereddictionary


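    【Solution 1】:

    Here is one approach: extract the 'fields' dict from each record and pass the resulting list of dicts to pd.DataFrame.

    Demo: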

    from collections import OrderedDict
    import pandas as pd
    
    data = OrderedDict([('records',
                      [OrderedDict([('id', 'rec0O8L1dlrobrPtj'),
                                    ('fields', OrderedDict()),
                                    ('createdTime', '2018-05-18T05:36:54.000Z')]),
                       OrderedDict([('id', 'rec13WqEutT0SwIP0'),
                                    ('fields',
                                     OrderedDict([('Lead ID', '64556'),
                                                  ('Company Name',
                                                   'CesKath (Ukay-Ukay) / KRKK Online Shop'),
                                                  ('Client Name',
                                                   'Kamille Rona Venturina Taytay'),
                                                  ('Principal Defendant Name/s',
                                                   'n/a'),
                                                  ('Co-Defendant Name/s', 'n/a'),
                                                  ('Plaintiff', 'n/a'),
                                                  ('Nature of Case', 'n/a'),
                                                  ('Trial Court', 'n/a'),
                                                  ('City/Province', 'n/a'),
                                                  ('Sala No.', 'n/a'),
                                                  ('Case Number', 'n/a'),
                                                  ('Case Status', 'n/a'),
                                                  ('Address', 'n/a')]))])
                       ]
                  )])
    
    # Collect each record's "fields" dict; an empty dict becomes a row of NaN.
    df = pd.DataFrame([d["fields"] for d in data["records"]])
    print(df)
    

    Output:

      Lead ID                            Company Name  \
    0     NaN                                     NaN   
    1   64556  CesKath (Ukay-Ukay) / KRKK Online Shop   
    
                         Client Name Principal Defendant Name/s  \
    0                            NaN                        NaN   
    1  Kamille Rona Venturina Taytay                        n/a   
    
      Co-Defendant Name/s Plaintiff Nature of Case Trial Court City/Province  \
    0                 NaN       NaN            NaN         NaN           NaN   
    1                 n/a       n/a            n/a         n/a           n/a   
    
      Sala No. Case Number Case Status Address  
    0      NaN         NaN         NaN     NaN  
    1      n/a         n/a         n/a     n/a  
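
If the record-level keys (id, createdTime) should stay next to the fields, or the all-NaN rows from records with an empty fields dict should be dropped, the same idea can be extended. The following is only a sketch under assumptions: the helper name records_to_frame and its drop_empty parameter are hypothetical, the sample data is shortened, and it assumes every record carries id, fields and createdTime keys as in the question.

```python
from collections import OrderedDict
import pandas as pd

def records_to_frame(response, drop_empty=True):
    """Flatten response["records"] into one row per record (hypothetical helper)."""
    rows = []
    for rec in response["records"]:
        if drop_empty and not rec["fields"]:
            continue  # skip records whose "fields" dict is empty
        # Record-level keys first, then the nested field values.
        row = {"id": rec["id"], "createdTime": rec["createdTime"]}
        row.update(rec["fields"])
        rows.append(row)
    return pd.DataFrame(rows)

# Shortened sample in the same shape as the question's data.
data = OrderedDict([("records", [
    OrderedDict([("id", "rec0O8L1dlrobrPtj"),
                 ("fields", OrderedDict()),
                 ("createdTime", "2018-05-18T05:36:54.000Z")]),
    OrderedDict([("id", "rec13WqEutT0SwIP0"),
                 ("fields", OrderedDict([("Lead ID", "64556"),
                                         ("Plaintiff", "n/a")])),
                 # Placeholder timestamp: the real value is not shown in the question.
                 ("createdTime", "2000-01-01T00:00:00.000Z")]),
])])

df = records_to_frame(data)                     # empty record dropped
full = records_to_frame(data, drop_empty=False) # empty record kept as NaN row
print(df)
```

With several API responses (for example, one per table), calling the helper once per response would give one DataFrame each.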
    

    【Discussion】:
