How can these nested for loops be turned into a single loop? Answers

【Question title】: How can these nested for loops be turned into a single loop?
【Posted】: 2021-05-22 09:39:54
【Question】:

I have this list:

ls = [[0, 'C', [1, 2, 3, 4], 'E', []],
      [1, 'C', [0, 5, 6, 7], 'E', []],
      [2, 'H', [0], '-', []],
      [3, 'H', [0], '-', []],
      [4, 'H', [0], '-', []],
      [5, 'H', [1], '-', []],
      [6, 'O', [1], 'X', []],
      [7, 'H', [1], '-', []]]

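This represents a molecule. The first column is just an index, the second column is the element of that atom, and the third column lists the atoms it is bonded to. For example, atom 0 is bonded to atoms 1, 2, 3 and 4. I want to find how far each atom is from the oxygen, and store that information in the last column. So the output should be: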

[[0, 'C', [1, 2, 3, 4], 'E', 2],
[1, 'C', [0, 5, 6, 7], 'E', 1],
[2, 'H', [0], '-', 3],
[3, 'H', [0], '-', 3],
[4, 'H', [0], '-', 3],
[5, 'H', [1], '-', 2],
[6, 'O', [1], 'X', 0],
[7, 'H', [1], '-', 2]]

I tried this loop, and it works fine:

def dists(data):
    new_data = data
    # Loop to find the distances from the X-atom:
    for sl2 in new_data:
        if sl2[3] == "X":
            sl2[4] = 0
            next1 = sl2[2]

    for number1, row1 in enumerate(new_data):
        for a1 in next1:
            if new_data[number1][0] == a1:
                if type(new_data[a1][4]) == list:
                    new_data[a1][4] = 1
                    next2 = new_data[a1][2]

                    for number2, row2 in enumerate(new_data):
                        for a2 in next2:
                            if new_data[number2][0] == a2:
                                if type(new_data[a2][4]) == list:
                                    new_data[a2][4] = 2
                                    next3 = new_data[a2][2]

                                    for number3, row3 in enumerate(new_data):
                                        for a3 in next3:
                                            if new_data[number3][0] == a3:
                                                if type(new_data[a3][4]) == list:
                                                    new_data[a3][4] = 3
                                                    next4 = new_data[a3][2]
                                                        #etc...
    return new_data

ls2 = dists(ls)

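But now I have to write many nested for loops. How can I turn these nested loops into a single loop?

Edit @deadshot

The values in the fifth column come from these: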

sl2[4] = 0
new_data[a1][4] = 1
new_data[a2][4] = 2
new_data[a3][4] = 3
etc...

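In the fourth column, the "X" just helps me find the oxygen atom, which is the starting point.

【Question comments】:

What is column 4, and how are the values of column 5 calculated? Please provide a sample calculation for column 5.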

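Tags: python loops

【Solution 1】:

In computer science terms, what you have is called a graph. Your atoms are the "nodes" or "vertices", and the bonds between them are the "edges". You need to find the distance between the oxygen and every other node. This can be done with a breadth-first search (there are other ways, but I think this is the easiest one to start with).

I strongly recommend reading the Wikipedia page on it, but here is a Python version that works with your data structure: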

from collections import deque


def bfs(ls):
    # index of the oxygen atom (assumes atom i is stored at row i of ls)
    root = [idx for idx, element, *_ in ls if element == 'O'][0]
    ls[root][4] = 0  # mark oxygen as distance 0
    queue = deque([(root, 0)])  # (atom index, distance from oxygen)
    discovered = {root}

    while queue:
        v, d = queue.popleft()
        for edge in ls[v][2]:  # every atom bonded to v
            if edge not in discovered:
                discovered.add(edge)
                queue.append((edge, d + 1))
                ls[edge][4] = d + 1  # add distance to new atom

Run it like this:

bfs(ls)

ls after running:

[[0, 'C', [1, 2, 3, 4], 'E', 2],
 [1, 'C', [0, 5, 6, 7], 'E', 1],
 [2, 'H', [0], '-', 3],
 [3, 'H', [0], '-', 3],
 [4, 'H', [0], '-', 3],
 [5, 'H', [1], '-', 2],
 [6, 'O', [1], 'X', 0],
 [7, 'H', [1], '-', 2]]

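Final note: you could exploit your data structure to avoid keeping discovered as a separate variable, but I included it here to stay closer to the pseudocode on the wiki page. That makes it easier to compare the Python code with the theory.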
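
To illustrate the note above, here is a minimal sketch of the same search without a separate discovered set. It is not part of the original answer: the name bfs_in_place is made up, and it assumes that atom i is stored at row i and that every atom not yet reached still holds the empty-list placeholder in its last column.

```python
from collections import deque


def bfs_in_place(ls):
    # Hypothetical variant: the empty list in column 5 doubles as the
    # "not discovered yet" marker, so no extra set is needed.
    root = next(row[0] for row in ls if row[3] == 'X')
    ls[root][4] = 0  # the oxygen is at distance 0 from itself
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for neighbour in ls[v][2]:
            if ls[neighbour][4] == []:  # placeholder means "not reached yet"
                ls[neighbour][4] = ls[v][4] + 1
                queue.append(neighbour)
    return ls


molecule = [[0, 'C', [1, 2, 3, 4], 'E', []],
            [1, 'C', [0, 5, 6, 7], 'E', []],
            [2, 'H', [0], '-', []],
            [3, 'H', [0], '-', []],
            [4, 'H', [0], '-', []],
            [5, 'H', [1], '-', []],
            [6, 'O', [1], 'X', []],
            [7, 'H', [1], '-', []]]
bfs_in_place(molecule)
distances = [row[4] for row in molecule]
```

The trade-off is that the function can only be run once on a given list, because a second run would find no placeholders left to fill.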

【Comments】:

【Solution 2】:

Since this is a graph problem, as Christian Sloper mentioned, you can also use networkx:

import networkx as nx

ls = [[0, 'C', [1, 2, 3, 4], 'E', []],
      [1, 'C', [0, 5, 6, 7], 'E', []],
      [2, 'H', [0], '-', []],
      [3, 'H', [0], '-', []],
      [4, 'H', [0], '-', []],
      [5, 'H', [1], '-', []],
      [6, 'O', [1], 'X', []],
      [7, 'H', [1], '-', []]]

# identify atom marked as X
x_index = next(filter(lambda x: x[3] == "X", ls))[0]

G = nx.Graph()
G.add_nodes_from([atom[0] for atom in ls])
G.add_edges_from(set((a1[0], a2) for a1 in ls for a2 in a1[2]))
map_atom_dist = nx.single_source_shortest_path_length(G, x_index)
# map_atom_dist: {6: 0, 1: 1, 0: 2, 5: 2, 7: 2, 2: 3, 3: 3, 4: 3}

[atom[:-1] + [map_atom_dist[atom[0]]] for atom in ls]

# Output:
[[0, 'C', [1, 2, 3, 4], 'E', 2],
 [1, 'C', [0, 5, 6, 7], 'E', 1],
 [2, 'H', [0], '-', 3],
 [3, 'H', [0], '-', 3],
 [4, 'H', [0], '-', 3],
 [5, 'H', [1], '-', 2],
 [6, 'O', [1], 'X', 0],
 [7, 'H', [1], '-', 2]]

【Comments】:

【Solution 3】:

A shorter recursive solution:

ls = [[0, 'C', [1, 2, 3, 4], 'E', []], [1, 'C', [0, 5, 6, 7], 'E', []], [2, 'H', [0], '-', []], [3, 'H', [0], '-', []], [4, 'H', [0], '-', []], [5, 'H', [1], '-', []], [6, 'O', [1], 'X', []], [7, 'H', [1], '-', []]]
d_ls = {a:b for a, *b in ls} #convert ls to a dictionary for faster lookup
def to_node(target, nodes, c = []):
   if any(d_ls[i][0] == target for i in nodes):
      yield len(c)+1
   else:
      for i in filter(lambda x:x not in c, nodes):
         yield from to_node(target, d_ls[i][1], c+[i])

r = [[*a, 0 if a[1] == 'O' else next(to_node('O', a[2]), [a[0]])] for *a, _ in ls]

Output:

[[0, 'C', [1, 2, 3, 4], 'E', 2], 
 [1, 'C', [0, 5, 6, 7], 'E', 1], 
 [2, 'H', [0], '-', 3], 
 [3, 'H', [0], '-', 3], 
 [4, 'H', [0], '-', 3], 
 [5, 'H', [1], '-', 2], 
 [6, 'O', [1], 'X', 0], 
 [7, 'H', [1], '-', 2]]
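
One thing worth checking with a generator like this: it takes the first path a depth-first walk finds, which matches the shortest distance here because this molecule has no rings. The sketch below is only an illustration under assumptions: the five-membered ring and the names ring, d_ring and bfs_distance are made up, and to_node is a copy of the generator above that reads from d_ring. On that ring the two approaches disagree.

```python
from collections import deque

# Hypothetical five-membered ring in the same row layout; atom 3 is the oxygen.
ring = [[0, 'C', [1, 4], 'E', []],
        [1, 'C', [0, 2], 'E', []],
        [2, 'C', [1, 3], 'E', []],
        [3, 'O', [2, 4], 'X', []],
        [4, 'C', [3, 0], 'E', []]]
d_ring = {a: b for a, *b in ring}


def to_node(target, nodes, c=()):
    # Same depth-first logic as the generator above, reading from d_ring.
    if any(d_ring[i][0] == target for i in nodes):
        yield len(c) + 1
    else:
        for i in (n for n in nodes if n not in c):
            yield from to_node(target, d_ring[i][1], (*c, i))


def bfs_distance(start, target):
    # Breadth-first search reaches atoms in order of increasing distance.
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        v, d = queue.popleft()
        if d_ring[v][0] == target:
            return d
        for n in d_ring[v][1]:
            if n not in seen:
                seen.add(n)
                queue.append((n, d + 1))
    return None


dfs_first = next(to_node('O', ring[0][2]))  # walks 0 -> 1 -> 0 -> 4 -> 3
bfs_shortest = bfs_distance(0, 'O')         # takes 0 -> 4 -> 3
```

Here dfs_first is 4 while bfs_shortest is 2, so for molecules with rings a breadth-first approach is the safer choice.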

【Comments】: