【Question title】:How to convert multiple layers of nested JSON to a SQL table
【Posted】:2017-03-20 06:42:20
【Question】:

With help from StackOverflow I was able to get this far. I need some more help converting the JSON below to a SQL table. Any help is much appreciated.

{
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "AttachTime": "2013-12-18T22:35:00.000Z",
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "State": "attached",
            "DeleteOnTermination": true,
            "Device": "/dev/sda1",

            "Tags": [{
                "Value": "DBJanitor-Private",
                "Key": "Name"
            }, {
                "Value": "DBJanitor",
                "Key": "Owner"
            }, {
                "Value": "Database",
                "Key": "Product"
            }, {
                "Value": "DB Janitor",
                "Key": "Portfolio"
            }, {
                "Value": "DB Service",
                "Key": "Service"
            }]
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": true,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z"
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901"
    }]
}

With help from StackOverflow I was able to get as far as the Tags. I am not sure how to handle Ebs. I am very new to coding, so any help is much appreciated.

In [1]: import json
   ...: import pandas as pd
   ...: fn = r'D:\temp\.data\40454898.json'

In [2]: with open(fn) as f:
   ...:     data = json.load(f)
   ...:

In [14]: t = pd.io.json.json_normalize(data['Volumes'],
    ...:                               ['Attachments','Tags'],
    ...:                               [['Attachments', 'VolumeId'],
    ...:                                ['Attachments', 'InstanceId']])
    ...:

In [15]: t
Out[15]:
         Key              Value Attachments.InstanceId   Attachments.VolumeId
0       Name  DBJanitor-Private    i-1234567890abcdef0  vol-049df61146c4d7901
1      Owner          DBJanitor    i-1234567890abcdef0  vol-049df61146c4d7901
2    Product           Database    i-1234567890abcdef0  vol-049df61146c4d7901
3  Portfolio         DB Janitor    i-1234567890abcdef0  vol-049df61146c4d7901
4    Service         DB Service    i-1234567890abcdef0  vol-049df61146c4d7901

Thanks.

【Question comments】:

    Tags: python mysql json pandas


    【Solution 1】:

    json_normalize expects a list of dicts at the record path, but Ebs is a single dict, so we should preprocess the JSON data first:

    In [88]: with open(fn) as f:
        ...:     data = json.load(f)
        ...:
    
    In [89]: for r in data['Volumes']:
        ...:     if 'Ebs' not in r:  # add an empty 'Ebs' list if the key is missing
        ...:         r['Ebs'] = []
        ...:     if not isinstance(r['Ebs'], list):  # wrap a single 'Ebs' dict in a list
        ...:         r['Ebs'] = [r['Ebs']]
        ...:
    
    In [90]: data
    Out[90]:
    {'Volumes': [{'Attachments': [{'AttachTime': '2013-12-18T22:35:00.000Z',
         'DeleteOnTermination': True,
         'Device': '/dev/sda1',
         'InstanceId': 'i-1234567890abcdef0',
         'State': 'attached',
         'Tags': [{'Key': 'Name', 'Value': 'DBJanitor-Private'},
          {'Key': 'Owner', 'Value': 'DBJanitor'},
          {'Key': 'Product', 'Value': 'Database'},
          {'Key': 'Portfolio', 'Value': 'DB Janitor'},
          {'Key': 'Service', 'Value': 'DB Service'}],
         'VolumeId': 'vol-049df61146c4d7901'}],
       'AvailabilityZone': 'us-east-1a',
       'Ebs': [{'AttachTime': '2016-09-14T19:49:11.000Z',
         'DeleteOnTermination': True,
         'Status': 'attached',
         'VolumeId': 'vol-049df61146c4d7901'}],
       'VolumeId': 'vol-049df61146c4d7901',
       'VolumeType': 'standard'}]}
    

    NOTE: 'Ebs': {..} has been replaced with 'Ebs': [{..}]
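Side note: as an alternative to wrapping Ebs in a list as done above, json_normalize called without a record path flattens nested dicts into dotted column names. The following is only a minimal sketch under a few assumptions: pandas 1.0 or later (where the function is available as `pd.json_normalize`, and `pd.io.json.json_normalize` is deprecated), and an abbreviated inline copy of the sample record instead of the JSON file.

```python
import pandas as pd

# Abbreviated inline copy of the sample record (assumption: same shape as the file;
# the Attachments list is left empty to keep the sketch short).
data = {
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": True,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z",
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901",
    }]
}

# Without a record path, nested dicts are flattened into "Ebs.<field>" columns;
# list-valued fields such as Attachments stay as a single column.
flat = pd.json_normalize(data["Volumes"])

# Keep the parent key plus the flattened Ebs columns.
ebs_cols = [c for c in flat.columns if c.startswith("Ebs.")]
ebs = flat[["VolumeId"] + ebs_cols]
print(ebs)
```

This variant yields one row per volume, which fits here because each volume has at most one Ebs dict. If Ebs could hold several entries per volume, the list-wrapping approach above would still be needed.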

    In [91]: e = pd.io.json.json_normalize(data['Volumes'],
        ...:                               ['Ebs'],
        ...:                               ['VolumeId'],
        ...:                               meta_prefix='parent_')
        ...:
    
    
    In [92]: e
    Out[92]:
                     AttachTime DeleteOnTermination    Status               VolumeId        parent_VolumeId
    0  2016-09-14T19:49:11.000Z                True  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
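To get from the normalized DataFrame to an actual SQL table, `DataFrame.to_sql` can be used. The sketch below is illustrative only: it rebuilds a one-row stand-in for the frame `e` shown above, writes it to an in-memory SQLite database so that it runs without a server, and uses a hypothetical table name `volume_ebs`. For MySQL, the assumption is that you would pass a SQLAlchemy engine to `to_sql` instead of the SQLite connection.

```python
import sqlite3

import pandas as pd

# Stand-in for the normalized frame `e` from the answer (one row, same columns).
e = pd.DataFrame([{
    "AttachTime": "2016-09-14T19:49:11.000Z",
    "DeleteOnTermination": True,
    "Status": "attached",
    "VolumeId": "vol-049df61146c4d7901",
    "parent_VolumeId": "vol-049df61146c4d7901",
}])

# In-memory SQLite keeps the sketch self-contained; swap in a SQLAlchemy
# engine for MySQL (connection details are not part of the original post).
conn = sqlite3.connect(":memory:")

# "volume_ebs" is a hypothetical table name; if_exists="replace" recreates it.
e.to_sql("volume_ebs", conn, if_exists="replace", index=False)

# Read the rows back to confirm that the table was written.
rows = conn.execute("SELECT VolumeId, Status FROM volume_ebs").fetchall()
print(rows)
conn.close()
```

The Tags frame `t` from the question could be written the same way to a second table, with the VolumeId columns acting as the join key between the two.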
    

    【Comments】:
