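How would you optimize this simple but slow Python 'for' loop? Answers

【Question title】: How would you optimize this simple but slow Python 'for' loop?
【Posted】: 2020-06-21 15:09:22
【Question description】:

Basically, I have a function that, for each row, adds one more value at a time until the sum reaches a given threshold. Once the threshold is reached, it takes the resulting slice index and uses it to return the mean of that same slice of another column.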

import numpy as np

#Random data:
values = np.random.uniform(0,10,300000)
values2 = np.random.uniform(0,10,300000)
output = [0]*len(values)

#Function that operates on one single row and returns the mean
def function(threshold,row):
    slice_sum=0
    i=1
    while slice_sum < threshold:
        slice_sum = values[row-i:row].sum()
        i=i+1        
    mean = values2[row-i:row].mean()
    return mean


#Loop to iterate the function row by row:
for i in range(15,len(values)): #let's just skip the first 15 values, otherwise the loop might get stuck. This issue is not a priority though.
    output[i] = function(40,i)

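This is a simplified version of the loop. It may not look slow, but for all intents and purposes it is very slow. So I would like to know whether there is a faster way to achieve this without a for loop.

Thanks

【Question comments】:

Just to simplify the problem: you don't need to preallocate output to any particular length. Just use output = [function(40, i) for i in range(15, len(values))].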
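
One detail worth keeping in mind with the comprehension suggested in this comment: it produces len(values) - 15 elements, so its indices are shifted by 15 relative to the preallocated list in the question. A minimal sketch of the difference, using a hypothetical stand-in for function that simply returns its row number (the toy data is an assumption for illustration only):

```python
# Hypothetical stand-in for the question's function: it returns the row index
# so the position of each result is easy to see.
def function(threshold, row):
    return row

values = list(range(20))  # toy data; only its length matters here

# Preallocated version from the question: output[i] belongs to row i.
preallocated = [0] * len(values)
for i in range(15, len(values)):
    preallocated[i] = function(40, i)

# Comprehension from the comment: element j belongs to row j + 15.
comprehension = [function(40, i) for i in range(15, len(values))]

print(len(preallocated), len(comprehension))  # 20 5
print(preallocated[15], comprehension[0])     # 15 15
```

If later code indexes the result by row number, the preallocated form (or an explicit offset) is probably the safer choice.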

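Tags: python numpy loops iteration

【Solution 1】:

Use searchsorted on the cumulative sum of values to jump directly to the next group. After a single O(N) pass to build the cumulative sum (N being the number of values), each group costs one O(log N) binary search, so finding the boundaries of n groups takes O(n log N):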

import numpy as np

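def meanBlocks(values,values2,threshold):
    sums = np.cumsum(values)                # running total of values
    i = j = k = 0                           # i,j: bounds of current block; k: blocks written so far
    output = np.zeros(values.size)
    while j < values.size:
        s = sums[j]-values[j]+threshold     # s is next cumsum to reach
        # position of next increment by threshold; max() guarantees progress
        # (and a non-empty block) when a single value already reaches the threshold
        i,j = j,max(j+1,np.searchsorted(sums,s))
        output[k] = np.mean(values2[i:j])   # track mean of values2 for range
        k += 1
    return output[:k]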

Output:

values  = np.arange(10)
values2 = np.arange(10)*5
print(values)
print(values2)
print(meanBlocks(values,values2,13))

[0 1 2 3 4 5 6 7 8 9]           #   (0+1+2+3+4)    (5+6)     (7)   ...   
[ 0  5 10 15 20 25 30 35 40 45] # (0,5,10,15,20)  (25,30)    (35)  ...
[10.  27.5 35.  40.  45. ]      #   50/5 = 10    55/2=27.5    35   ...
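
To see where the block boundaries in this first example come from, the following sketch traces the searchsorted targets step by step. The names target and bounds are introduced here for illustration only and do not appear in the answer.

```python
import numpy as np

values = np.arange(10)
threshold = 13

sums = np.cumsum(values)
print(sums)  # [ 0  1  3  6 10 15 21 28 36 45]

bounds = [0]  # block k covers values[bounds[k]:bounds[k+1]]
j = 0
while j < values.size:
    # Cumulative sum just before position j, plus the threshold.
    target = sums[j] - values[j] + threshold
    # First position whose cumulative sum is >= target.
    j = int(np.searchsorted(sums, target))
    bounds.append(j)

print(bounds)  # [0, 5, 7, 8, 9, 10]
```

The boundaries 0, 5, 7, 8, 9, 10 correspond to the blocks (0..4), (5, 6), (7), (8), (9) shown in the output above.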


print("")
values    = np.random.uniform(0,10,300000)
values2   = np.random.uniform(0,10,300000)
print(values)
print(values2)
print(meanBlocks(values,values2,40)) # takes 0.43 sec on my laptop

[6.79333765 2.22880971 1.37706989 ... 8.75649835 2.92422716 5.1280224 ]
[3.56901367 0.15243962 6.76291706 ... 4.47662928 2.61969948 8.0941208 ]
[4.88477774 3.87464821 5.42599828 ... 4.47055786 4.48768768 5.17582407]
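
Note that meanBlocks returns one mean per consecutive, non-overlapping block, whereas the loop in the question computes one backward-looking window per row. The same cumsum plus searchsorted idea can probably be applied per row as well. The sketch below is one possible way to do that, not the answer author's code: the function name rolling_threshold_mean is hypothetical, it assumes values is non-negative and threshold is positive, and it marks rows whose window would run past the start of the array as NaN. Because it compares differences of prefix sums rather than re-summing slices, floating-point rounding could in rare borderline cases pick a window that differs by one element from the question's loop.

```python
import numpy as np

def rolling_threshold_mean(values, values2, threshold):
    """For each row r, find the shortest window values[r-k:r] whose sum reaches
    `threshold`, then average values2[r-k-1:r] (the slice the question's loop uses).
    Rows whose window would run past the start of the array are set to NaN.
    Assumes values >= 0 and threshold > 0."""
    n = values.size
    # Prefix sums with a leading 0, so sum(x[a:b]) == prefix[b] - prefix[a].
    p = np.concatenate(([0.0], np.cumsum(values)))
    q = np.concatenate(([0.0], np.cumsum(values2)))
    rows = np.arange(n)
    # Largest start index s with p[s] <= p[r] - threshold, i.e. the shortest window.
    s = np.searchsorted(p, p[rows] - threshold, side="right") - 1
    start = s - 1              # the question's loop includes one extra element
    valid = start >= 0
    out = np.full(n, np.nan)
    r, a = rows[valid], start[valid]
    out[valid] = (q[r] - q[a]) / (r - a)   # mean of values2[a:r]
    return out

# Small deterministic example (toy data, chosen so sums are exact).
values = np.tile([1.0, 2.0, 3.0, 4.0], 50)
values2 = np.arange(200, dtype=float)
result = rolling_threshold_mean(values, values2, 40)
print(result[40])  # mean of values2[23:40]
```

This replaces the Python-level loop over rows with one vectorized binary search, at the cost of the rounding caveat mentioned above.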

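【Comments】:

【Solution 2】:

You don't need to recompute the sum on every pass of the loop. You start with values[row-1:row] (a single value) and, if the sum is still too small, add one more value. Instead of re-summing the same values on every iteration, add the next value to the previous sum.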

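def function(threshold, row):
    # Running sum over values[row-i:row], extended by one value per step.
    # Assumes the threshold is reached before row - i drops below 0;
    # otherwise the negative index wraps around to the end of the array.
    slice_sum = 0
    for i in range(1, len(values)+1):
        slice_sum += values[row-i]
        if slice_sum >= threshold:
            break
    # Same slice as the question's code, whose i ends one step past the threshold.
    return values2[row-i-1:row].mean()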

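This reduces the number of additions per row from O(n^2) to O(n), where n is the number of values needed to reach the threshold.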
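
A quick way to convince yourself that the incremental version returns the same result as the question's loop is to run both on the same data. The sketch below is only an illustration: the names original and incremental, the seed, and the integer-valued test data (chosen so that the sums are exact and every window fits inside the array) are assumptions, not part of the thread.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
# Integer-valued floats in 1..9: sums are exact, and 40 values always reach 40.
values = rng.integers(1, 10, 1000).astype(float)
values2 = rng.integers(1, 10, 1000).astype(float)

def original(threshold, row):
    # Question's version: re-sums the whole slice on every step.
    slice_sum, i = 0, 1
    while slice_sum < threshold:
        slice_sum = values[row - i:row].sum()
        i += 1
    return values2[row - i:row].mean()

def incremental(threshold, row):
    # Answer's version: adds one new value to the running sum per step.
    slice_sum = 0
    for i in range(1, len(values) + 1):
        slice_sum += values[row - i]
        if slice_sum >= threshold:
            break
    return values2[row - i - 1:row].mean()

# Start at row 50 so that every window stays inside the array.
rows = range(50, len(values))
agree = all(original(40, r) == incremental(40, r) for r in rows)
print(agree)
```

With arbitrary floating-point data the two running sums are accumulated in a different order, so a borderline row could in principle stop one step apart; comparing with np.isclose on the results would not catch that, which is why the sketch uses exact integer-valued data.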

【Comments】: