Iterating through nested list/dictionary in Python: answers

【Question title】: Iterating through nested list/dictionary in python
【Posted】: 2021-03-19 08:02:10
【Question description】:

I am trying to parse a YAML file - https://github.com/open-telemetry/opentelemetry-specification/blob/master/semantic_conventions/resource/cloud.yaml

I am using the following code

import yaml

with open('cloud.yaml') as f:
    my_dict = yaml.safe_load(f)

print(my_dict)

which produces the following dictionary

{'groups': [{'id': 'cloud', 'prefix': 'cloud', 'brief': 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n', 'attributes': [{'id': 'provider', 'type': {'allow_custom_values': True, 'members': [{'id': 'AWS', 'value': 'aws', 'brief': 'Amazon Web Services'}, {'id': 'Azure', 'value': 'azure', 'brief': 'Microsoft Azure'}, {'id': 'GCP', 'value': 'gcp', 'brief': 'Google Cloud Platform'}]}, 'brief': 'Name of the cloud provider.\n', 'examples': 'gcp'}, {'id': 'account.id', 'type': 'string', 'brief': 'The cloud account ID used to identify different entities.\n', 'examples': ['opentelemetry']}, {'id': 'region', 'type': 'string', 'brief': 'A specific geographical location where different entities can run.\n', 'examples': ['us-central1']}, {'id': 'zone', 'type': 'string', 'brief': 'Zones are a sub set of the region connected through low-latency links.\n', 'note': 'In AWS, this is called availability-zone.\n', 'examples': ['us-central1-a']}]}]}

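I want to iterate over the elements and extract the following values

id - cloud
all attributes -> id - provider; id - account.id; id - region; id - zone
members - aws, azure, gcp

I am trying to iterate over all the key/value pairs with the following code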

for groups in my_dict.values():
    print(groups)

The output is

[{'id': 'cloud', 'prefix': 'cloud', 'brief': 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n', 'attributes': [{'id': 'provider', 'type': {'allow_custom_values': True, 'members': [{'id': 'AWS', 'value': 'aws', 'brief': 'Amazon Web Services'}, {'id': 'Azure', 'value': 'azure', 'brief': 'Microsoft Azure'}, {'id': 'GCP', 'value': 'gcp', 'brief': 'Google Cloud Platform'}]}, 'brief': 'Name of the cloud provider.\n', 'examples': 'gcp'}, {'id': 'account.id', 'type': 'string', 'brief': 'The cloud account ID used to identify different entities.\n', 'examples': ['opentelemetry']}, {'id': 'region', 'type': 'string', 'brief': 'A specific geographical location where different entities can run.\n', 'examples': ['us-central1']}, {'id': 'zone', 'type': 'string', 'brief': 'Zones are a sub set of the region connected through low-latency links.\n', 'note': 'In AWS, this is called availability-zone.\n', 'examples': ['us-central1-a']}]}]

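I want to print all the values individually, e.g. cloud, A cloud infrastructure (e.g. GCP, Azure, AWS)\n, and so on.

The output I need is to print the following values: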

cloud, A cloud infrastructure (e.g. GCP, Azure, AWS).
cloud.provider,, Name of the cloud provider.
cloud.provider.member, AWS, Amazon Web Services
cloud.provider.member, azure, Microsoft Azure
cloud.provider.member, GCP, Google Cloud Platform
cloud.account.id, string, The cloud account ID used to identify different entities.
cloud.region, string, A specific geographical location where different entities can run.    
.
.
.
.
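One generic way to visit every value in a structure like this, regardless of how deeply the lists and dictionaries are nested, is a small recursive generator. The sketch below is only an illustration: `iter_leaves` and `sample` are hypothetical names that do not appear in the question, and `sample` is a trimmed-down stand-in for the parsed YAML.

```python
def iter_leaves(node, path=()):
    """Yield (path, value) pairs for every non-container value in node."""
    if isinstance(node, dict):
        # Descend into each key of a dictionary.
        for key, value in node.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(node, list):
        # Descend into each element of a list, recording its index.
        for index, value in enumerate(node):
            yield from iter_leaves(value, path + (index,))
    else:
        # Anything else (str, bool, int, ...) is treated as a leaf.
        yield path, node


# Hypothetical, shortened stand-in for the dictionary in the question.
sample = {
    "groups": [
        {
            "id": "cloud",
            "brief": "A cloud infrastructure (e.g. GCP, Azure, AWS)\n",
            "attributes": [{"id": "region", "type": "string"}],
        }
    ]
}

leaves = list(iter_leaves(sample))
for path, value in leaves:
    print(path, repr(value))
```

Each path is a tuple of keys and indexes, so `("groups", 0, "id")` corresponds to `sample["groups"][0]["id"]`. This prints every value, which is useful for exploring the structure; producing the specific lines listed above still needs code that knows which keys matter.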

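【Comments】:

Can you give an example of the structure/output you want to put these values into? You already have all the data you're asking about, so it's just a matter of putting it into a different shape.
@Samwise Thanks for the reply. I'm having trouble iterating over the output dictionary. It looks like a nested dictionary with sub-lists/dictionaries. print(my_dict["groups"][0]["id"]) prints cloud.
@Samwise I've updated the question with more details.
Please provide an example of the output you expect from the input you've already given.
@Grismar - Thanks for your help. I've added an expected output section to the question.

Tags: python python-3.x yaml pyyaml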

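【Solution 1】:

This can also be done in a generic way, by checking whether the value under 'type' is a dict instance:

Assuming the variable parsed_dict holds the result of parsing the YAML file: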

def remove_end_of_line_char(line_text):
    if len(line_text) > 0 and line_text[-1] == '\n':
        line_text = line_text[:-1]

    return line_text


data_groups = parsed_dict["groups"]
for group in data_groups:
    msg = remove_end_of_line_char(f"{group['id']}, {group['brief']}")
    print(msg)
    attributes_list = group["attributes"]
    for attribute in attributes_list:
        attr_type = attribute['type']
        if isinstance(attr_type, dict):
            print(f"{group['id']}.{attribute['id']},, {remove_end_of_line_char(attribute['brief'])}")
            cloud_provider_member_prefix = f"{group['id']}.{attribute['id']}.member, "
            for member in attr_type['members']:
                print(f"{cloud_provider_member_prefix}{member['id']}, {member['brief']}")
        else:
            msg = remove_end_of_line_char(f"{group['id']}.{attribute['id']}, {attribute['type']}, {attribute['brief']}")
            print(msg)
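The same logic can be wrapped in a function that returns the lines instead of printing them, which makes it easier to test or to write the result to a file. This is a sketch under assumptions, not part of the original answer: `flatten_group` and `sample_group` are hypothetical names, `str.rstrip()` stands in for `remove_end_of_line_char`, and the sample data is a shortened copy of the dictionary from the question.

```python
def flatten_group(group):
    """Return 'name, type, brief' lines for one group dictionary."""
    lines = [f"{group['id']}, {group['brief'].rstrip()}"]
    for attribute in group.get("attributes", []):
        name = f"{group['id']}.{attribute['id']}"
        attr_type = attribute["type"]
        brief = attribute["brief"].rstrip()
        if isinstance(attr_type, dict):
            # Enum-like attribute: no scalar type, plus one line per member.
            lines.append(f"{name},, {brief}")
            for member in attr_type.get("members", []):
                lines.append(f"{name}.member, {member['id']}, {member['brief']}")
        else:
            # Plain attribute: 'type' is a simple string such as 'string'.
            lines.append(f"{name}, {attr_type}, {brief}")
    return lines


# Hypothetical, shortened stand-in for parsed_dict["groups"][0].
sample_group = {
    "id": "cloud",
    "brief": "A cloud infrastructure (e.g. GCP, Azure, AWS)\n",
    "attributes": [
        {
            "id": "provider",
            "type": {
                "allow_custom_values": True,
                "members": [
                    {"id": "AWS", "value": "aws", "brief": "Amazon Web Services"},
                ],
            },
            "brief": "Name of the cloud provider.\n",
        },
        {
            "id": "region",
            "type": "string",
            "brief": "A specific geographical location where different entities can run.\n",
        },
    ],
}

result = flatten_group(sample_group)
for line in result:
    print(line)
```

With the real data this would be called once per entry in `parsed_dict["groups"]`.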

【Comments】:

【Solution 2】:

Here is your output dictionary. I've made it readable.

myDict = {
    'groups': [
        {
            'id': 'cloud',
            'prefix': 'cloud',
            'brief': 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n',
            'attributes': [
                {
                    'id': 'provider',
                    'type': {
                        'allow_custom_values': True,
                        'members': [
                            {
                                'id': 'AWS',
                                'value': 'aws',
                                'brief': 'Amazon Web Services'
                            },
                            {
                                'id': 'Azure',
                                'value': 'azure',
                                'brief': 'Microsoft Azure'
                            },
                            {
                                'id': 'GCP',
                                'value': 'gcp',
                                'brief': 'Google Cloud Platform'
                            }
                        ]
                    },
                    'brief': 'Name of the cloud provider.\n',
                    'examples': 'gcp'
                },
                {
                    'id': 'account.id',
                    'type': 'string',
                    'brief': 'The cloud account ID used to identify different entities.\n',
                    'examples': ['opentelemetry']
                },
                {
                    'id': 'region',
                    'type': 'string',
                    'brief': 'A specific geographical location where different entities can run.\n',
                    'examples': ['us-central1']
                },
                {
                    'id': 'zone',
                    'type': 'string',
                    'brief': 'Zones are a sub set of the region connected through low-latency links.\n',
                    'note': 'In AWS, this is called availability-zone.\n',
                    'examples': ['us-central1-a']
                }
            ]
        }
    ]
}

Now we can see it clearly.

for v in myDict['groups'][0].items():
    print(v)

Output:

('id', 'cloud')
('prefix', 'cloud')
('brief', 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n')
('attributes', [{'id': 'provider', 'type': {'allow_custom_values': True, 'members': [{'id': 'AWS', 'value': 'aws', 'brief': 'Amazon Web Services'}, {'id': 'Azure', 'value': 'azure', 'brief': 'Microsoft Azure'}, {'id': 'GCP', 'value': 'gcp', 'brief': 'Google Cloud Platform'}]}, 'brief': 'Name of the cloud provider.\n', 'examples': 'gcp'}, {'id': 'account.id', 'type': 'string', 'brief': 'The cloud account ID used to identify different entities.\n', 'examples': ['opentelemetry']}, {'id': 'region', 'type': 'string', 'brief': 'A specific geographical location where different entities can run.\n', 'examples': ['us-central1']}, {'id': 'zone', 'type': 'string', 'brief': 'Zones are a sub set of the region connected through low-latency links.\n', 'note': 'In AWS, this is called availability-zone.\n', 'examples': ['us-central1-a']}])

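Now extract the data like this. You could also get all the values in a single for loop.

data = myDict['groups'][0]
group_id = data['id']  # named group_id to avoid shadowing the built-in id()
brief = data['brief']  # ends with '\n', so print() leaves a blank line after it
attr = data['attributes']
mems = attr[0]['type']['members']  # members of the first attribute (provider)

print(f"{group_id},{brief}")

for member in mems:
    print(f"cloud.provider.member.{member['value']}, {member['brief']}")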

Output:

cloud,A cloud infrastructure (e.g. GCP, Azure, AWS)

cloud.provider.member.aws, Amazon Web Services
cloud.provider.member.azure, Microsoft Azure
cloud.provider.member.gcp, Google Cloud Platform
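The extraction above relies on the provider attribute being first in the list (`attr[0]`). As a hedged sketch of the single-loop idea, the function below walks every attribute once and collects member values only where 'type' is a dictionary. `member_values` and `sample_group` are hypothetical names, and the sample is a reduced copy of the data shown earlier.

```python
def member_values(group):
    """Map each enum-like attribute id to the list of its member values."""
    result = {}
    for attribute in group["attributes"]:
        attr_type = attribute["type"]
        # Only attributes whose 'type' is a dict carry a 'members' list.
        if isinstance(attr_type, dict):
            result[attribute["id"]] = [m["value"] for m in attr_type["members"]]
    return result


# Hypothetical, reduced stand-in for myDict['groups'][0].
sample_group = {
    "id": "cloud",
    "attributes": [
        {"id": "account.id", "type": "string"},
        {
            "id": "provider",
            "type": {
                "allow_custom_values": True,
                "members": [
                    {"id": "AWS", "value": "aws"},
                    {"id": "Azure", "value": "azure"},
                    {"id": "GCP", "value": "gcp"},
                ],
            },
        },
    ],
}

values = member_values(sample_group)
print(values)
```

Because the check is on the shape of 'type' rather than on position, the result stays the same if the attributes are reordered in the YAML file.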

【Comments】: