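WARNING! The code is broken into pieces here, so the indentation of each fragment may not line up with the others. I recommend also looking at the code in the question itself / the itertools documentation (same code).
It has been over 7 years since this question was asked. Wow. I got interested in it myself, and while the explanations above were very helpful, they didn't quite hit the spot for me, so here is the summary I made for myself.
Since I finally managed to understand it (or at least I think I did), I figured it might be beneficial to post this "version" of the explanation in case there are more people like me. So let's get started.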
def combinations(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
In this first part, we simply turn the iterable into a tuple and get its length. Both will be useful later.
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
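This is pretty simple too: if the requested combination length is greater than the size of our element pool, we can't build a valid combination (you can't make a combination of 5 elements out of 4), so we simply stop execution with a return statement. We also yield the first combination (the first r elements of the iterable).
The next part is slightly more complicated, so read it carefully.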
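As a quick check of the two points above, here is a small sketch using the standard library's itertools.combinations. The sample string "ABCD" is just an illustrative input, not something from the question.

```python
from itertools import combinations

pool = "ABCD"  # illustrative input (assumption)

# r > n: no combination can be built, so the generator yields nothing
too_long = list(combinations(pool, 5))

# the first combination produced is the first r elements of the pool
first = next(combinations(pool, 2))

print(too_long)  # []
print(first)     # ('A', 'B')
```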
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - (r - i):
                break
"""
The job of the while loop is to increment the indices one after
the other and yield every element combination that corresponds to
a valid combination of indices.
This for loop's job is to make sure we never over-increment any value.
To avoid errors, incrementing the last entry of the index list must
stop once it reaches one less than the length of our element list;
otherwise we'd run into an IndexError (trying to access an index
outside the range of the pool).
How do we do that?
reversed(range(r)) gives us values counting down from r-1 to 0
(r-1, r-2, r-3, ... , 0)
So first and foremost, indices[r - 1] must never exceed (n-1)
(remember, n is the length of our element pool), as that is the largest
valid index. It can therefore be incremented only while
indices[r - 1] < n - 1
Moreover, because we stop incrementing indices[r - 1] when it reaches its
maximum value, we must also stop incrementing indices[r - 2] when it reaches
its maximum value. What is the maximum value of indices[r - 2]?
Since indices[r - 1] is always set to a larger value than indices[r - 2]
(we want no duplicates), and the maximum of indices[r - 1] is (n-1),
the largest value indices[r - 2] can reach is (n-2).
This trend continues. More generally, for k = 1, ..., r:
indices[r - k] < n - k
Now, since r - k is just the position i produced by reversed(range(r)),
we can substitute:
r - k = i -----> k = r - i
indices[r - k] < n - k -----> indices[i] < n - (r - i)
That's our limit: position i can be incremented only while the
inequality { indices[i] < n - (r - i) } holds. The loop scans from the
right and breaks at the first position where it still holds.
(The code tests indices[i] != n - (r - i) instead; because an index never
exceeds its maximum, the two tests are equivalent. In the documentation
it's written as indices[i] != i + n - r, which means the exact same thing.
I simply find this version easier to visualize and understand.)
"""
        else:
            return
"""
When the for loop runs to completion, every entry in our index list is
already at its maximum value, which means we've gone through every
single combination and can exit the function.
It's important to note that the else clause is not attached to
the if statement here, but to the for loop. It's a for-else
construct, meaning "if the loop finished iterating without hitting
a break, execute the else clause".
"""
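To make the limit and the for-else behaviour concrete, here is a minimal sketch. The helper name rightmost_incrementable and the values n = 5, r = 3 are my own choices for illustration; the helper returns None in the case where the original code would reach the else branch.

```python
def rightmost_incrementable(indices, n):
    """Hypothetical helper: rightmost position that can still grow, else None."""
    r = len(indices)
    for i in reversed(range(r)):
        if indices[i] != n - (r - i):
            return i  # same spot where the original loop hits `break`
    return None       # same case as the original for-else `return`

n, r = 5, 3  # illustrative sizes (assumption)

# maximum value each position may hold: n - (r - i)
max_indices = [n - (r - i) for i in range(r)]
print(max_indices)                              # [2, 3, 4]

print(rightmost_incrementable([0, 1, 2], n))    # 2
print(rightmost_incrementable([0, 1, 4], n))    # 1
print(rightmost_incrementable([2, 3, 4], n))    # None
```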
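If we did manage to break out of the for loop, it means we can safely increment that index to get the next combination (the first line below).
The for loop below makes sure that every time we move an index forward, we reset the indices after it to their smallest possible values, so that we don't miss any combinations.
For example, suppose we didn't do this. Say our pool is (0, 1, 2, 3, 4) and the current combination of indices is (0, 1, 4). When we move on and increment 1 to 2, the last index would stay the same, 4, and we would miss (0, 2, 3), registering only (0, 2, 4) as a valid combination. Instead, after we increment (1 -> 2), we update the following indices based on it: (4 -> 3), and when we run the while loop again we'll increment 3 to 4 (refer to the previous section).
Note that we never increment the earlier indices (those before position i), so as not to create duplicates.
Finally, on each iteration, the yield statement produces the element combination that corresponds to the current combination of indices.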
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
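To see what the reset loop buys us, here is a sketch that yields only the index tuples, with a switch to turn the reset off. The function name index_sequence and the reset flag are hypothetical additions for illustration; the rest mirrors the fragments above.

```python
def index_sequence(n, r, reset=True):
    """Yield successive index tuples; reset=False skips the reset loop."""
    if r > n:
        return
    indices = list(range(r))
    yield tuple(indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - (r - i):
                break
        else:
            return
        indices[i] += 1
        if reset:  # hypothetical switch, not part of the original code
            for j in range(i + 1, r):
                indices[j] = indices[j - 1] + 1
        yield tuple(indices)

with_reset = list(index_sequence(5, 3))
without_reset = list(index_sequence(5, 3, reset=False))

print(with_reset[2:5])     # [(0, 1, 4), (0, 2, 3), (0, 2, 4)]
print(without_reset[2:4])  # [(0, 1, 4), (0, 2, 4)]
```

With the reset, all 10 index combinations of 3 out of 5 appear; without it, (0, 2, 3) and others are skipped, exactly as described in the example above.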
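As the documentation states, because we are dealing with positions, elements are treated as unique based on their position in the iterable, not on their value.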
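A short illustration of that last point, using itertools.combinations on an input with a repeated value (the string "AAB" is my own example):

```python
from itertools import combinations

# the two 'A's sit at different positions, so they count as different elements
result = list(combinations("AAB", 2))
print(result)       # [('A', 'A'), ('A', 'B'), ('A', 'B')]

# by value there are only two distinct pairs
print(len(set(result)))  # 2
```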