【Question title】: How to create a nested JSON from pandas DataFrame?
【Posted】: 2021-03-24 17:06:03
【Question description】:

I am trying to generate nested JSON from a DataFrame in which each car's attributes are spread across several rows.

DataFrame

import pandas as pd

cars = {'brand': ['Honda','Toyota','Ford','Audi','Honda','Toyota','Ford','Audi'],
        'model': ['Civic','Corolla','Focus','A4','Civic','Corolla','Focus','A4'],
        'attributeName': ['color','color','color','color','doors','doors','doors','doors'],
        'attributeValue': ['red','blue','black','red',2,4,4,2]
        }

df = pd.DataFrame(cars)

What I tried

First I grouped the rows and tried to apply the nesting:

df.groupby(['brand','model'])\
             .apply(lambda x: x[['attributeName','attributeValue']].to_dict('records'))\
             .to_json(orient='records')

Result

[[{"attributeName":"color","attributeValue":"red"},{"attributeName":"doors","attributeValue":2}],[{"attributeName":"color","attributeValue":"black"},{"attributeName":"doors","attributeValue":4}],[{"attributeName":"color","attributeValue":"red"},{"attributeName":"doors","attributeValue":2}],[{"attributeName":"color","attributeValue":"blue"},{"attributeName":"doors","attributeValue":4}]]

Expected result

[
    {
        'brand':'Honda',
        'model':'Civic',
        'attributes':[
            {
                'name':'color',
                'value':'red'
            }
        ]
    },
    {...}
]

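So what can I do to include the other record fields (brand and model) in the output, and not just the attributes?

【Question discussion】:

    Tags: python pandas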


    【Solution 1】:

    Add rename and reset_index() to your solution:

    d = {'attributeName':'name','attributeValue':'value'}
    j = df.rename(columns=d).groupby(['brand','model']).apply(lambda x: x[['name','value']].to_dict('records')).reset_index(name='attributes').to_json(orient='records')
    print (j)
    [{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"},{"name":"doors","value":2}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"},{"name":"doors","value":4}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"},{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"},{"name":"doors","value":4}]}]
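For readers who find the chained one-liner hard to follow, the same grouping can be written as an explicit loop. This is only a sketch of an alternative, not the answer's method; the names `records` and `nested_json` are illustrative, and it assumes the `cars` data from the question.

```python
import json

import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]}
df = pd.DataFrame(cars)

records = []
# Iterating over a groupby yields (key tuple, sub-DataFrame) pairs,
# sorted by the group keys by default.
for (brand, model), group in df.groupby(['brand', 'model']):
    # Collect every attribute row of this car into one list of dicts.
    attributes = [
        {'name': name, 'value': value}
        for name, value in zip(group['attributeName'], group['attributeValue'])
    ]
    records.append({'brand': brand, 'model': model, 'attributes': attributes})

# Serialize with the standard library instead of DataFrame.to_json.
nested_json = json.dumps(records)
print(nested_json)
```

The loop keeps brand and model because they are read from the group key, which is the piece of information that the original attempt lost when it serialized only the grouped values.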
    

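    Or, if each attribute should become its own record, add explode() and wrap each exploded dict in a list again: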

    d = {'attributeName':'name','attributeValue':'value'}
    j = df.rename(columns=d).groupby(['brand','model']).apply(lambda x: x[['name','value']].to_dict('records')).explode().apply(lambda x: [x]).reset_index(name='attributes').to_json(orient='records')
    print (j)
    [{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"}]},{"brand":"Audi","model":"A4","attributes":[{"name":"doors","value":2}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"doors","value":4}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"doors","value":4}]}]
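To see what explode() contributes here, it may help to run it on a small Series of lists. The sketch below uses a hand-built two-car Series (the name `grouped` and its contents are illustrative stand-ins for the groupby result), then repeats the wrapping step from the one-liner above.

```python
import pandas as pd

# A small stand-in for the Series produced by groupby(...).apply(...):
# one list of attribute dicts per (brand, model) index entry.
grouped = pd.Series(
    [
        [{'name': 'color', 'value': 'red'}, {'name': 'doors', 'value': 2}],
        [{'name': 'color', 'value': 'black'}, {'name': 'doors', 'value': 4}],
    ],
    index=pd.MultiIndex.from_tuples(
        [('Audi', 'A4'), ('Ford', 'Focus')], names=['brand', 'model']
    ),
)

# explode() turns every list element into its own row and repeats
# the index entry of the list it came from.
exploded = grouped.explode()

# Each row now holds a single dict; wrap it in a list again so that
# 'attributes' is still a JSON array after serialization.
wrapped = exploded.apply(lambda attribute: [attribute])

print(exploded)
print(wrapped.reset_index(name='attributes'))
```

Two rows with two attributes each become four rows with one attribute each, which is why this variant yields eight records for the four cars in the question.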
    

    A third option builds the attributes column row by row, without groupby:

    df['attributes'] = df.apply(lambda x: [{'name': x['attributeName'], 'value': x['attributeValue']}], axis=1)
    df = df.drop(['attributeName','attributeValue'], axis=1)
    print (df)
        brand    model                             attributes
    0   Honda    Civic    [{'name': 'color', 'value': 'red'}]
    1  Toyota  Corolla   [{'name': 'color', 'value': 'blue'}]
    2    Ford    Focus  [{'name': 'color', 'value': 'black'}]
    3    Audi       A4    [{'name': 'color', 'value': 'red'}]
    4   Honda    Civic        [{'name': 'doors', 'value': 2}]
    5  Toyota  Corolla        [{'name': 'doors', 'value': 4}]
    6    Ford    Focus        [{'name': 'doors', 'value': 4}]
    7    Audi       A4        [{'name': 'doors', 'value': 2}]
    
    j = df.to_json(orient='records')
    print (j)
    [{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"}]},{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"doors","value":4}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"doors","value":4}]},{"brand":"Audi","model":"A4","attributes":[{"name":"doors","value":2}]}]
    

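    【Discussion】:

    • This looks great and works. I went with the first option because I'm not sure yet how explode() works, but I'll look into it more closely.
    • @HedgeHog - I was a bit unsure what the expected output should be: the first solution produces different output than the second and third.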
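The difference raised in the last comment can be checked by parsing the JSON and counting attributes per record. The following is a rough sketch under a few assumptions: it selects the `name` and `value` columns before `apply` (a small variation on the answer's code) and rebuilds the row-wise variant with a list comprehension; `option_1` and `option_3` are illustrative names.

```python
import json

import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]}
df = pd.DataFrame(cars).rename(
    columns={'attributeName': 'name', 'attributeValue': 'value'}
)

# Grouped variant: one record per car, all attributes collected in one list.
grouped = (
    df.groupby(['brand', 'model'])[['name', 'value']]
      .apply(lambda g: g.to_dict('records'))
      .reset_index(name='attributes')
)

# Row-wise variant: one record per original row, one attribute per list.
flat = df[['brand', 'model']].copy()
flat['attributes'] = [
    [{'name': name, 'value': value}]
    for name, value in zip(df['name'], df['value'])
]

option_1 = json.loads(grouped.to_json(orient='records'))
option_3 = json.loads(flat.to_json(orient='records'))

print(len(option_1), [len(r['attributes']) for r in option_1])
print(len(option_3), [len(r['attributes']) for r in option_3])
```

Which shape is right depends on how the expected result in the question is read: its example shows a single attribute per car, but the trailing `{...}` leaves open whether further attributes belong in the same list.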