【Question title】: How to iterate over all keys of a json with nested dictionaries and lists?
【Posted】: 2025-12-13 19:05:03
【Question】:

I need to modify the following json file, test.json:

{
  "install": {
    "site": {
      "acls": {
        "dns": {
          "authorized_ports": ["53:tcp", "53:udp"]
        }
      },
      "network": {
        "clusters": {
          "__ip_range_1__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          },
          "__ip_range_2__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          }
        }
      }
    }
  }
}

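The above is abbreviated; the original file contains many more entries. I have several such files for each site, so the __ip_range_x__ keys differ from file to file, and so does every IP. I need to add an entry to every __ip_range_x__ element. The new entry is a dict of dicts stored in mod.json (interface_config below):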

{
  "path": "{install}{site}{network}{clusters}{*}",
  "install" :  {
    "site": {
      "network": {
        "clusters": {
          "__iprange": {
            "interface_config": {
              "framesize": "1500",
              "framesize_vm": "1500"
            }
          }
        }
      }
    }
  }
}

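I also need to add other entries to different parts of the original json file.

For now, I am just trying to iterate over all elements in test.json. Eventually, I want to build a path for every element in test.json and match it against the path in mod.json in order to modify test.json. However, I cannot even print all the elements of the original file. My current code: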

import json
import pprint

def traverse(d, path=None):
    if path is None:
        path = []
    for item,val  in d.iteritems():
        if isinstance(item, dict):
            for k,v in item.iteritems():
                print k
                traverse(v)
        elif isinstance(item, list):
            for j in item:
                (traverse(j))
        else:
            print item
        if isinstance(val, dict):
            for k,v in val.iteritems():
                print k
                traverse(v)
        elif isinstance(val, list):
            for j in val:
                (traverse(j))
with open("test.json", "r") as jf:
    data = json.load(jf)
traverse(data)

The output of the above is:

$ ./now.py
install
site
acls
dns
authorized_ports
Traceback (most recent call last):
  File "./now.py", line 51, in <module>
    traverse(data)
  File "./now.py", line 23, in traverse
    traverse(v)
  File "./now.py", line 23, in traverse
    traverse(v)
  File "./now.py", line 26, in traverse
    (traverse(j))
  File "./now.py", line 9, in traverse
    for item,val  in d.iteritems():
AttributeError: 'unicode' object has no attribute 'iteritems'

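I know my first iteritems call is in the wrong place, but I just can't figure out the recursion... any pointers are appreciated. By the way, I am using Python 2.

Edit

The actual json I am trying to process is more complex than the one listed above. Here is a redacted version: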

{
  "install": {
    "site": {
      "acls": {
        "dns": {
          "authorized_ports": ["53:tcp", "53:udp"]
        }
      },
      "network": {
        "clusters": {
          "__ip_range_1__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          },
          "__ip_range_2__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          }
        }
      }
    }
  },
  "config": {
    "ippool": [
      {
        "pool_name": "/ippool1",
        "pool_description": "IP Pool1",
        "ranges": [["__ip__", "__ip__"]]
      },
      {
        "pool_name": "/ippool2",
        "pool_description": "IP Pool2",
        "ranges": [["__ip__", "__ip__"]]
      }
    ],
    "storage": [
      {
        "account": "/root",
        "credentials": {
          "account": "admin",
          "service": "storage",
          "user": "admin",
          "password": "pass"
        }
      }
    ]

  }
}

I modified Paul Panzer's answer to include lists, as follows:

def traverse(d, path=[]):
    for k, v  in d.iteritems():
        yield path + [k], v
        if isinstance(v, dict):
            for k,v in traverse(v, path + [k]):
                yield k,v
        elif isinstance(v, list):
            for k in v:
                traverse(k, path + [])

However, the above does not print the elements of the ippool and storage lists. As soon as a list of dicts is encountered, for some reason it is not traversed.
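One likely explanation, shown as a minimal sketch: traverse is a generator function, so the bare call traverse(k, path + []) in the list branch only creates a generator object and discards it. Nothing inside it runs until something iterates over it. The count_up function and the log list below are hypothetical stand-ins used only to demonstrate the effect; they are not part of the original code.

```python
def count_up(n, log):
    # Generator function: the body runs only while the generator is iterated.
    for i in range(n):
        log.append(i)
        yield i

log = []
count_up(3, log)               # creates a generator object and throws it away
after_bare_call = list(log)    # nothing has been logged yet

for _ in count_up(3, log):     # iterating actually runs the body
    pass
after_iteration = list(log)

print(after_bare_call)
print(after_iteration)
```

By the same reasoning, the list branch would have to loop over the recursive call and re-yield each pair, as the dict branch already does. It would also need to skip list elements that are plain strings, since those have no iteritems.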

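【Comments】:

  • Keys (item) cannot be lists or dicts, so your first set of conditions is useless. When val is a dict, you assume v is a dict too, which is not always true, hence your error. When val is a dict, you should just recurse with traverse(val) instead of iterating over val.
  • So your test.json has only one key at the top level ('install'). Is the "path" to your __ip_range_x always the same? E.g. always under 'install' > 'site' > 'network' > 'clusters'?
  • @MaxPower My test.json has many entries at the top level: some dicts, some dicts of dicts, and more complex ones with many levels of dicts and lists. However, the path to __ip_range_x is always the same.
  • OK, so does AChampion's answer work for you?
  • @MaxPower It produces a "too many values to unpack" error.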
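The advice in the first comment above can be sketched roughly as follows. This is an illustration rather than the commenter's own code: collect_keys is a hypothetical name, it gathers keys into a list instead of printing them, and it uses .items() so that it also runs on Python 3 (on Python 2, iteritems() would work the same way).

```python
def collect_keys(node, keys=None):
    """Return every dict key found in a nested dict/list structure, depth first."""
    if keys is None:
        keys = []
    if isinstance(node, dict):
        for key, val in node.items():
            keys.append(key)
            collect_keys(val, keys)      # recurse into the value, whatever it is
    elif isinstance(node, list):
        for element in node:
            collect_keys(element, keys)  # list elements may be dicts, lists or leaves
    # Anything else (string, number, None) is a leaf: nothing to do.
    return keys

sample = {"install": {"site": {"acls": {"dns": {"authorized_ports": ["53:tcp", "53:udp"]}}}}}
print(collect_keys(sample))
```

The type check happens once, at the top of the function, so the recursive call never has to assume what kind of value it was handed.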

Tags: python json recursion python-2.x


【Solution 1】:

Here is a cleaned-up version of your traverse routine. It simply walks a nested dict/list; I have cut everything else for clarity. Hope it helps.

master = {
  "install": {
    "site": {
      "acls": {
        "dns": {
          "authorized_ports": ["53:tcp", "53:udp"]
        }
      },
      "network": {
        "clusters": {
          "__ip_range_1__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          },
          "__ip_range_2__": {
            "dhcpstart": "__ip__",
            "dhcpend": "__ip__",
            "adminip": "__ip__"
          }
        }
      }
    }
  },
  "config": {
    "ippool": [
      {
        "pool_name": "/ippool1",
        "pool_description": "IP Pool1",
        "ranges": [["__ip__", "__ip__"]]
      },
      {
        "pool_name": "/ippool2",
        "pool_description": "IP Pool2",
        "ranges": [["__ip__", "__ip__"]]
      }
    ],
    "storage": [
      {
        "account": "/root",
        "credentials": {
          "account": "admin",
          "service": "storage",
          "user": "admin",
          "password": "pass"
        }
      }
    ]

  }
}

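def traverse(dict_or_list, path=[]):
    # The default list is never mutated: path + [k] always builds a new list.
    if isinstance(dict_or_list, dict):
        iterator = dict_or_list.iteritems()   # (key, value) pairs
    else:
        iterator = enumerate(dict_or_list)    # (index, element) pairs
    for k, v in iterator:
        yield path + [k], v
        if isinstance(v, (dict, list)):
            # Re-yield everything the recursive generator produces.
            for sub_path, sub_v in traverse(v, path + [k]):
                yield sub_path, sub_v

for path, node in traverse(master):
    print path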

Output:

['config']
['config', 'ippool']
['config', 'ippool', 0]
['config', 'ippool', 0, 'ranges']
['config', 'ippool', 0, 'ranges', 0]
['config', 'ippool', 0, 'ranges', 0, 0]
['config', 'ippool', 0, 'ranges', 0, 1]
['config', 'ippool', 0, 'pool_name']
['config', 'ippool', 0, 'pool_description']
['config', 'ippool', 1]
['config', 'ippool', 1, 'ranges']
['config', 'ippool', 1, 'ranges', 0]
['config', 'ippool', 1, 'ranges', 0, 0]
['config', 'ippool', 1, 'ranges', 0, 1]
['config', 'ippool', 1, 'pool_name']
['config', 'ippool', 1, 'pool_description']
['config', 'storage']
['config', 'storage', 0]
['config', 'storage', 0, 'credentials']
['config', 'storage', 0, 'credentials', 'account']
['config', 'storage', 0, 'credentials', 'password']
['config', 'storage', 0, 'credentials', 'user']
['config', 'storage', 0, 'credentials', 'service']
['config', 'storage', 0, 'account']
['install']
['install', 'site']
['install', 'site', 'acls']
['install', 'site', 'acls', 'dns']
['install', 'site', 'acls', 'dns', 'authorized_ports']
['install', 'site', 'acls', 'dns', 'authorized_ports', 0]
['install', 'site', 'acls', 'dns', 'authorized_ports', 1]
['install', 'site', 'network']
['install', 'site', 'network', 'clusters']
['install', 'site', 'network', 'clusters', '__ip_range_2__']
['install', 'site', 'network', 'clusters', '__ip_range_2__', 'dhcpend']
['install', 'site', 'network', 'clusters', '__ip_range_2__', 'adminip']
['install', 'site', 'network', 'clusters', '__ip_range_2__', 'dhcpstart']
['install', 'site', 'network', 'clusters', '__ip_range_1__']
['install', 'site', 'network', 'clusters', '__ip_range_1__', 'dhcpend']
['install', 'site', 'network', 'clusters', '__ip_range_1__', 'adminip']
['install', 'site', 'network', 'clusters', '__ip_range_1__', 'dhcpstart']
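The paths produced above could then be compared with the "path" entry of mod.json. The following is only a sketch under assumptions: it treats {*} as a wildcard for exactly one path component, assumes keys contain no braces, and is written for Python 3 (.items() instead of iteritems(), tuples instead of lists for paths). parse_pattern and matches are hypothetical helper names, not part of the answer.

```python
def traverse(node, path=()):
    """Yield (path, value) pairs for every entry of a nested dict/list."""
    if isinstance(node, dict):
        items = node.items()
    else:
        items = enumerate(node)
    for key, val in items:
        yield path + (key,), val
        if isinstance(val, (dict, list)):
            for sub_path, sub_val in traverse(val, path + (key,)):
                yield sub_path, sub_val

def parse_pattern(pattern):
    # "{a}{b}{*}" -> ["a", "b", "*"]; assumes keys contain no braces.
    return pattern.strip("{}").split("}{")

def matches(path, pattern_parts):
    # "*" is treated as a wildcard for exactly one path component (an assumption).
    return len(path) == len(pattern_parts) and all(
        part == "*" or part == key for key, part in zip(path, pattern_parts))

data = {"install": {"site": {"network": {"clusters": {
    "__ip_range_1__": {"adminip": "__ip__"},
    "__ip_range_2__": {"adminip": "__ip__"}}}}}}

parts = parse_pattern("{install}{site}{network}{clusters}{*}")
matched = [path for path, _ in traverse(data) if matches(path, parts)]
print(matched)
```

Each matched path identifies one dict that should receive the new entry, so the same loop could apply the modification instead of printing.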

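【Comments】:

  • Thanks, the above works well. However, I failed to show how complex my actual json really is. It also contains lists of nested dicts, so I am trying to handle those too. I will update my question with the extended json. If you could also add code that handles lists, I would be happy to accept your answer.
  • @Lidia Added lists. Let me know if this is what you had in mind.
【Solution 2】:

You seem to be making this more complicated than necessary: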

with open("test.json", "r") as jf:
    data = json.load(jf)
with open("mod.json", "r") as mf:
    mod = json.load(mf)

ip_ranges = data['install']['site']['network']['clusters']
for rng, val in mod['install']['site']['network']['clusters'].items():
    data[rng]["interface_config"] = val["interface_config"]

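【Comments】:

  • I get the following error when running the above: for rng, val in mod['install']['site']['network']['clusters']: ValueError: too many values to unpack - is it because val is a dict?
  • Sorry, I missed the .items(); it is a dict.
  • This still does not work; I get the following error: KeyError: u'__iprange'. This is where the two jsons, original and mod, do not match. Also, ip_ranges in the code above is assigned a value but never used. What is its purpose?
  • It works if I change the assignment to data['install']['site']['network']['clusters'][rng]["interface_config"] = new_value. I will add the corrected code as a separate answer, since data and mod were also swapped.
【Solution 3】:

Modified code from AChampion, to add an entry under all elements of a dict with dynamic keys: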

with open(args.masterjson, "r") as masterjf:
    data = json.load(masterjf)
with open(args.modjson, "r") as modjf:
    mod = json.load(modjf)

new_value = mod['install']['site']['network']['clusters']['__iprange']['interface_config']
for rng, val in data['install']['site']['network']['clusters'].items():
    data['install']['site']['network']['clusters'][rng]["interface_config"] = new_value

The above adds a new dict of dicts under every ip range in clusters. Output:

master BEFORE update:
{u'dhcpend': u'__ip__', u'adminip': u'__ip__', u'dhcpstart': u'__ip__'}
{u'dhcpend': u'__ip__', u'adminip': u'__ip__', u'dhcpstart': u'__ip__'}
master AFTER update:
{'interface_config': {u'framesize_vm': u'1500', u'framesize': u'1500'}, u'dhcpend': u'__ip__', u'adminip': u'__ip__', u'dhcpstart': u'__ip__'}
{'interface_config': {u'framesize_vm': u'1500', u'framesize': u'1500'}, u'dhcpend': u'__ip__', u'adminip': u'__ip__', u'dhcpstart': u'__ip__'}
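One thing to be aware of: in the loop above, every range ends up pointing at the same new_value dict object. That makes no difference if the result is written straight back to a file, but if the per-range settings are edited in memory afterwards, a change to one shows up in all of them. A minimal sketch of one way around this, assuming independent copies are wanted (the inline data and the "9000" edit are made up for illustration):

```python
import copy
import json

data = {"install": {"site": {"network": {"clusters": {
    "__ip_range_1__": {"adminip": "__ip__"},
    "__ip_range_2__": {"adminip": "__ip__"}}}}}}
new_value = {"framesize": "1500", "framesize_vm": "1500"}

clusters = data["install"]["site"]["network"]["clusters"]
for rng in clusters:
    # Each range gets its own copy, so later edits to one do not leak into the others.
    clusters[rng]["interface_config"] = copy.deepcopy(new_value)

# Editing one range now leaves the other range and new_value untouched.
clusters["__ip_range_1__"]["interface_config"]["framesize"] = "9000"

result = json.dumps(data, indent=2, sort_keys=True)
print(result)
```

To write the result back to disk, json.dump(data, f, indent=2) on a file opened for writing would serve the same purpose as json.dumps here.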
