【Question title】: Converting objects into a pandas DataFrame?
【Posted】: 2018-01-30 09:44:23
【Question】:

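I have a DataFrame with the following columns:

  • created_at
  • id
  • data (this is the column I cannot parse)

Each object in the data column is a dictionary. I would like every key in that dictionary to become its own column. Any help, or a pointer to a suitable package, would be greatly appreciated.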

Below is what one object in the DataFrame looks like.

pd.Series(data.data[1])

backers_count                                                              37
blurb                       Nano Art will make and market customized piece...
category                    {'id': 21, 'name': 'Digital Art', 'slug': 'art...
converted_pledged_amount                                                 1974
country                                                                    US
created_at                                                         1332823105
creator                     {'id': 300795038, 'name': 'Sameer Walavalkar',...
currency                                                                  USD
currency_symbol                                                             $
currency_trailing_code                                                   True
current_currency                                                          USD
deadline                                                           1337287105
disable_communication                                                   False
fx_rate                                                                     1
goal                                                                     5000
id                                                                  120596924
is_starrable                                                            False
launched_at                                                        1333399105
location                    {'id': 2468964, 'name': 'Pasadena', 'slug': 'p...
name                                                       Nano Art: Reloaded
photo                       {'ed': 'https://ksr-ugc.imgix.net/assets/011/3...
pledged                                                                  1974
profile                     {'id': 118685, 'name': None, 'blurb': None, 's...
slug                                                        nano-art-reloaded
source_url                  https://www.kickstarter.com/discover/categorie...
spotlight                                                               False
staff_pick                                                               True
state                                                                  failed
state_changed_at                                                   1337287105
static_usd_rate                                                             1
urls                        {'web': {'project': 'https://www.kickstarter.c...
usd_pledged                                                            1974.0
usd_type                                                             domestic

data.data[1]
Out[61]: 
{'backers_count': 37,
 'blurb': 'Nano Art will make and market customized pieces, in a variety of materials, featuring etchings smaller than an eyelash.',
 'category': {'color': 16760235,
  'id': 21,
  'name': 'Digital Art',
  'parent_id': 1,
  'position': 3,
  'slug': 'art/digital art',
  'urls': {'web': {'discover': 'http://www.kickstarter.com/discover/categories/art/digital%20art'}}},
 'converted_pledged_amount': 1974,
 'country': 'US',
 'created_at': 1332823105,
 'creator': {'avatar': {'medium': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
   'small': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
   'thumb': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=40&h=40&fit=crop&v=1461381464&auto=format&q=92&s=a4881fe982e2d57b041e2a591cd3e04e'},
  'chosen_currency': None,
  'id': 300795038,
  'is_registered': True,
  'name': 'Sameer Walavalkar',
  'urls': {'api': {'user': 'https://api.kickstarter.com/v1/users/300795038?signature=1515881698.259ca61a8b86731ffbea53f82f4a05e8d1d9f965'},
   'web': {'user': 'https://www.kickstarter.com/profile/300795038'}}},
 'currency': 'USD',
 'currency_symbol': '$',
 'currency_trailing_code': True,
 'current_currency': 'USD',
 'deadline': 1337287105,
 'disable_communication': False,
 'fx_rate': 1,
 'goal': 5000,
 'id': 120596924,
 'is_starrable': False,
 'launched_at': 1333399105,
 'location': {'country': 'US',
  'displayable_name': 'Pasadena, CA',
  'id': 2468964,
  'is_root': False,
  'localized_name': 'Pasadena',
  'name': 'Pasadena',
  'short_name': 'Pasadena, CA',
  'slug': 'pasadena-ca-us',
  'state': 'CA',
  'type': 'Town',
  'urls': {'api': {'nearby_projects': 'https://api.kickstarter.com/v1/discover?signature=1515876022.17e1296a181b009b97c854cfded1e99beeefd9fa&woe_id=2468964'},
   'web': {'discover': 'https://www.kickstarter.com/discover/places/pasadena-ca-us',
    'location': 'https://www.kickstarter.com/locations/pasadena-ca-us'}}},
 'name': 'Nano Art: Reloaded',
 'photo': {'1024x576': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1024&h=576&fit=crop&v=1463681196&auto=format&q=92&s=4fe4bf0d8a75fa43bbf253f8c1eb5710',
  '1536x864': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1552&h=873&fit=crop&v=1463681196&auto=format&q=92&s=76cdc06919b3b8df0f3daae78ab57301',
  'ed': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=352&h=198&fit=crop&v=1463681196&auto=format&q=92&s=4446e512c7a07794efcf131e35eb0111',
  'full': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=560&h=315&fit=crop&v=1463681196&auto=format&q=92&s=1d312059cce791c9058255f83c123f47',
  'key': 'assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg',
  'little': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=208&h=117&fit=crop&v=1463681196&auto=format&q=92&s=c386ed08b0c603e1912b1620f9bb58d6',
  'med': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=272&h=153&fit=crop&v=1463681196&auto=format&q=92&s=51a7ae3eadcb187a8dd58c14396b0d8c',
  'small': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=160&h=90&fit=crop&v=1463681196&auto=format&q=92&s=9b1a141c6269c9940f44e273f416b73e',
  'thumb': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=48&h=27&fit=crop&v=1463681196&auto=format&q=92&s=73ef45a6f4df337a942b3a75fec87996'},
 'pledged': 1974,
 'profile': {'background_color': None,
  'background_image_opacity': 0.8,
  'blurb': None,
  'feature_image_attributes': {'image_urls': {'baseball_card': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=560&h=315&fit=crop&v=1463681196&auto=format&q=92&s=1d312059cce791c9058255f83c123f47',
    'default': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1552&h=873&fit=crop&v=1463681196&auto=format&q=92&s=76cdc06919b3b8df0f3daae78ab57301'}},
  'id': 118685,
  'link_background_color': None,
  'link_text': None,
  'link_text_color': None,
  'link_url': None,
  'name': None,
  'project_id': 118685,
  'should_show_feature_image_section': True,
  'show_feature_image': False,
  'state': 'inactive',
  'state_changed_at': 1425915807,
  'text_color': None},
 'slug': 'nano-art-reloaded',
 'source_url': 'https://www.kickstarter.com/discover/categories/art/digital%20art',
 'spotlight': False,
 'staff_pick': True,
 'state': 'failed',
 'state_changed_at': 1337287105,
 'static_usd_rate': 1,
 'urls': {'web': {'project': 'https://www.kickstarter.com/projects/300795038/nano-art-reloaded?ref=category_newest',
   'rewards': 'https://www.kickstarter.com/projects/300795038/nano-art-reloaded/rewards'}},
 'usd_pledged': '1974.0',
 'usd_type': 'domestic'}

I tried transposing the DataFrame and using a for loop to stack the second column produced by pd.Series, but it did not work.

【Discussion】:

    Tags: python pandas parsing dictionary dataframe


    【Solution 1】:

    Use json_normalize:

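    import pandas as pd

    # pandas >= 1.0 exposes json_normalize at the top level;
    # older releases need: from pandas.io.json import json_normalize
    df = pd.json_normalize(data.data[1])
    

    print(df)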
       backers_count                                              blurb  \
    0             37  Nano Art will make and market customized piece...   
    
       category.color  category.id category.name  category.parent_id  \
    0        16760235           21   Digital Art                   1   
    
       category.position    category.slug  \
    0                  3  art/digital art   
    
                              category.urls.web.discover  \
    0  http://www.kickstarter.com/discover/categories...   
    
       converted_pledged_amount    ...     \
    0                      1974    ...      
    
                                              source_url  spotlight staff_pick  \
    0  https://www.kickstarter.com/discover/categorie...      False       True   
    
        state state_changed_at static_usd_rate  \
    0  failed       1337287105               1   
    
                                        urls.web.project  \
    0  https://www.kickstarter.com/projects/300795038...   
    
                                        urls.web.rewards usd_pledged  usd_type  
    0  https://www.kickstarter.com/projects/300795038...      1974.0  domestic  
    
    [1 rows x 84 columns]
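
The call above flattens a single record, while the question asks for every row of the data column. A minimal sketch of one way to do that follows, assuming pandas 1.0 or later. The small `record` dictionary and the two-row `data` frame are hypothetical stand-ins for the asker's data, and the `data.` prefix is an illustrative choice rather than anything the answer prescribes.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame: two rows, nested dict in "data".
record = {
    "backers_count": 37,
    "country": "US",
    "category": {
        "id": 21,
        "name": "Digital Art",
        "urls": {"web": {"discover": "http://www.kickstarter.com/discover/categories/art/digital%20art"}},
    },
}
data = pd.DataFrame({
    "created_at": [1332823105, 1332823106],
    "id": [1, 2],
    "data": [record, record],
})

# json_normalize accepts a list of dicts and returns one row per dict,
# joining nested keys with "." (for example "category.name").
flat = pd.json_normalize(data["data"].tolist())
flat.index = data.index  # realign with the original rows

# The prefix guards against name clashes: in the question, the nested
# dictionaries carry their own "id" and "created_at" keys.
result = data.drop(columns="data").join(flat.add_prefix("data."))
print(result.columns.tolist())
```

Without the prefix, `join` would raise on overlapping column names as soon as a nested dictionary reuses an outer column name such as `id`.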
    

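    【Comments】:

    • Thank you so much! I am still getting used to pandas/Python, coming from R. I appreciate the help.
    【Solution 2】:

    Here is a subset of your example dictionary: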

    d = {
        'backers_count':
        37,
        'blurb':
        'Nano Art will make and market customized pieces, in a variety of materials, featuring etchings smaller than an eyelash.',
        'category': {
            'color': 16760235,
            'id': 21,
            'name': 'Digital Art',
            'parent_id': 1,
            'position': 3,
            'slug': 'art/digital art',
            'urls': {
                'web': {
                    'discover':
                    'http://www.kickstarter.com/discover/categories/art/digital%20art'
                }
            }
        },
        'converted_pledged_amount':
        1974,
        'country':
        'US',
        'created_at':
        1332823105,
        'creator': {
            'avatar': {
                'medium':
                'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
                'small':
                'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
                'thumb':
                'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=40&h=40&fit=crop&v=1461381464&auto=format&q=92&s=a4881fe982e2d57b041e2a591cd3e04e'
            },
            'chosen_currency': None,
            'id': 300795038,
            'is_registered': True,
            'name': 'Sameer Walavalkar',
            'urls': {
                'api': {
                    'user':
                    'https://api.kickstarter.com/v1/users/300795038?signature=1515881698.259ca61a8b86731ffbea53f82f4a05e8d1d9f965'
                },
                'web': {
                    'user': 'https://www.kickstarter.com/profile/300795038'
                }
            }
        }
    }
    

    I created a minimal example:

    import pandas as pd

    df = pd.DataFrame({"data": [d, d]})
    

    To convert each dictionary into a DataFrame, you can use the map function:

    list_df = df.data.map(lambda d: pd.DataFrame.from_dict(d, orient="index").transpose()).tolist()
    

    Then you can concatenate the results:

    df_concat = pd.concat(list_df)
    

    After this operation, you can join the original DataFrame data with df_concat.
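
That last join step can be sketched as follows. The shortened dictionary `d` and the extra `id` column are hypothetical, added only so there is something to join against. One detail to watch: each one-row frame produced by the map step has index 0, so this sketch passes `ignore_index=True` when concatenating to make the rows line up with the original frame.

```python
import pandas as pd

# Hypothetical, shortened version of the dictionary d from this answer.
d = {
    "backers_count": 37,
    "country": "US",
    "category": {"id": 21, "name": "Digital Art"},
}
df = pd.DataFrame({"id": [1, 2], "data": [d, d]})

# Same map step as in the answer: one single-row DataFrame per dictionary.
list_df = df.data.map(
    lambda x: pd.DataFrame.from_dict(x, orient="index").transpose()
).tolist()

# Renumber the rows 0..n-1 so they match df's default index.
df_concat = pd.concat(list_df, ignore_index=True)

# Column-wise concatenation aligns the two frames on their index.
combined = pd.concat([df.drop(columns="data"), df_concat], axis=1)
print(combined)
```

Note that this approach expands only the top level: the `category` column still holds whole dictionaries, whereas json_normalize would expand them into `category.id` and `category.name`.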
