Speed up python code - can I vectorize double for loop? (Answers)

【Question title】: Speed up python code - can I vectorize double for loop?
【Posted】: 2014-01-29 04:27:02
【Question】:

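I am new to Python. I am using DBSCAN code for clustering, with some changes of my own. The code now runs fine, but it is very slow, and I found out that I have to remove the for loops from my code. Here is part of the code: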

from math import radians, sin, cos, sqrt, atan2

class Point:
    def __init__(self, x = 0, y = 0, visited = False, isnoise = False):
        self.x = x  
        self.y = y  
        self.visited = visited
        self.isnoise = isnoise

    def show(self):  
        return self.x, self.y 

    def dist(self, p1, p2):  
        # Calculate the great-circle distance in km between two points
        # on the earth (x = latitude, y = longitude, in decimal degrees).
        # Convert the coordinate differences from degrees to radians.
        dlat = radians(p2.x-p1.x)
        dlon = radians(p2.y-p1.y)
        a = sin(dlat/2) * sin(dlat/2) + cos(radians(p1.x))* cos(radians(p2.x)) * sin(dlon/2) * sin(dlon/2)
        c = 2 * atan2(sqrt(a), sqrt(1-a))
        d = 6371 * c
        return d 

    def distanceQuery(self,neighbor_pts):
        dista=[]
        for i in range(len(neighbor_pts)):
          for j in range(i+1,len(neighbor_pts)):
            z=self.dist(neighbor_pts[i],neighbor_pts[j])
            dista.append(z)
        return max(dista)

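The distanceQuery function uses a double for loop. Is there any way to remove it? Can I vectorize this double for loop? Since this is clustering code, some steps need to append points. I have read that numpy arrays behave differently from Python lists when appending, and that appending to a numpy array is inefficient.

Edit:

So this can be vectorized. But here is the other part of the code, where the append happens right after I check a certain condition.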

def expandCluster(self, P, neighbor_points):  
     self.cluster[self.cluster_inx].append(P)  
     iterator = iter(neighbor_points)  
     while True:  
       try:   
         npoint_tmp = next(iterator)  
       except StopIteration:  
         # StopIteration exception is raised after last element  
         break  
       if (not npoint_tmp.visited):  
         #for each point P' in NeighborPts   
         npoint_tmp.visited = True  
         NeighborPts_ = self.regionQuery(npoint_tmp)  
         if (len(NeighborPts_) >= self.MinPts):  
           for j in range(len(NeighborPts_)):  
            neighbor_points.append(NeighborPts_[j])
            if self.distanceQuery(neighbor_points)>0.10:
              break

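Now, if I vectorize neighbor_points as well, won't I have to deal with the append problem? Each point is appended to neighbor_points and then a distanceQuery is run on it, and this whole process is itself part of an iteration, so there are two loops here as well. I just want to make sure that appending to a numpy array won't be inefficient.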

【Question comments】:

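I think this might be an XY problem - can you describe what your goal is, rather than just the part of the route you took to get there? I think there may be a solution involving scipy.spatial.KDTree
@Eric My goal is to get clusters using the dbscan procedure, but at the same time to limit those clusters by "the maximum distance between any two points in that cluster". I am trying to control the size of the clusters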
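
One possible way to enforce the "maximum distance between any two points" limit without re-running the all-pairs query after every append: a newly added point only creates pairs that involve itself, so it is enough to compare it against the points already accepted and keep a running maximum. The following is a minimal sketch under stated assumptions, not the asker's code: points are taken to be (lat, lon) tuples in degrees, and `haversine_to_many` and `DiameterTracker` are hypothetical names introduced here for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_to_many(p, pts):
    """Great-circle distance in km from one (lat, lon) point to an (n, 2) array."""
    lat1, lon1 = np.radians(p[0]), np.radians(p[1])
    lat2, lon2 = np.radians(pts[:, 0]), np.radians(pts[:, 1])
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # clip guards against rounding pushing `a` slightly outside [0, 1]
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

class DiameterTracker(object):
    """Keeps the largest pairwise distance of a growing set of points (hypothetical helper)."""

    def __init__(self):
        self.points = []      # plain Python list: appending is cheap
        self.diameter = 0.0   # max distance between any two accepted points

    def try_add(self, p, limit_km):
        """Add p only if the diameter stays within limit_km; return True if added."""
        if self.points:
            existing = np.asarray(self.points, dtype=float)
            d_new = haversine_to_many(p, existing).max()
        else:
            d_new = 0.0
        if max(self.diameter, d_new) > limit_km:
            return False
        self.points.append(p)
        self.diameter = max(self.diameter, d_new)
        return True
```

Each call does work proportional to the number of accepted points instead of recomputing every pair. The points stay in a Python list, so the append itself stays cheap, and numpy is used only for the one-to-many distance computation.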

Tags: python numpy vectorization

【Solution 1】:

import numpy as np

def dist(p1, p2): 
    # Initially, p1.shape == (n, 2) and p2.shape == (m, 2)
    # Now, p1.shape == (1, n, 2) and p2.shape == (m, 1, 2)
    p1 = p1[np.newaxis, :, :]
    p2 = p2[:, np.newaxis, :]

    # get all the vectory things
    from numpy import sin, cos, radians, sqrt, arctan2 as atan2 

    # do the same math as before, but use `p[..., 0]` instead of `p.x` etc
    dlat = radians(p2[..., 0] - p1[..., 0])
    dlon = radians(p2[..., 1] - p1[..., 1])
    # the latitudes must be converted to radians before taking the cosine
    a = sin(dlat/2) * sin(dlat/2) + cos(radians(p1[..., 0]))*cos(radians(p2[..., 0])) * sin(dlon/2) * sin(dlon/2)
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    d = 6371 * c
    return d  # shape (m, n): d[i, j] is the distance between p2[i] and p1[j]

def distanceQuery(neighbor_pts):
    return np.max(dist(neighbor_pts, neighbor_pts))

For example:

>>> points = np.array([[0, 0], [45, 0], [45, 45], [90, 0]], dtype=float) 
>>> d = dist(points, points)
>>> np.round(d, 1)
array([[    0. ,  5003.8,  6671.7, 10007.5],
       [ 5003.8,     0. ,  3491.5,  5003.8],
       [ 6671.7,  3491.5,     0. ,  5003.8],
       [10007.5,  5003.8,  5003.8,     0. ]])
>>> d.max()
10007.543398010286

Timings:

import itertools

def dist_slow(p1, p2):
    """your function, adjusted to take an array instead of a `Point`"""
    from math import radians, sin, cos, sqrt, atan2
    # compute the distance for a single pair of points
    dlat = radians(p2[0]-p1[0])
    dlon = radians(p2[1]-p1[1])

    a = sin(dlat/2) * sin(dlat/2) + cos(radians(p1[0]))*cos(radians(p2[0])) * sin(dlon/2) * sin(dlon/2)
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    d = 6371 * c
    return d

def query_iter(p):
    return max(dist_slow(p1, p2) for p1, p2 in itertools.combinations(p, 2))

def query_orig(p):
    dista=[]
    for i in range(len(p)):
      for j in range(i + 1, len(p)):
        z = dist_slow(p[i], p[j])
        dista.append(z)
    return max(dista)

def query_mine(p):
    return dist(p, p).max()

Then:

>>> points = np.random.rand(1000, 2)
>>> timeit query_orig(points)
1 loops, best of 3: 7.94 s per loop
>>> timeit query_iter(points)
1 loops, best of 3: 7.35 s per loop
>>> timeit query_mine(points)
10 loops, best of 3: 150 ms per loop

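【Discussion】:

Pretty much the same as mine :-)
Note that you also need to declare import numpy as np at the top (at least for np.newaxis)
This numpy solution should be much faster than the list comprehensions suggested by others (has anyone checked whether it is?)
So d is an array, then? I also have to use dist in other steps.
@usethedeathstar: It is about 50 times faster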
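
On the question of whether `d` is an array: with the broadcasting approach it is, with shape (m, n) for inputs of shape (n, 2) and (m, 2). If other steps need the distance between just two points, one option is a thin wrapper that reshapes each point to (1, 2) and pulls out the single entry. A minimal sketch follows; `dist` is a compact restatement of the broadcasting idea from this answer (converting to radians up front), and `dist_pair` is a hypothetical helper name, not something from the thread.

```python
import numpy as np

def dist(p1, p2):
    """Pairwise haversine distances in km between (n, 2) and (m, 2) arrays -> (m, n)."""
    p1 = np.radians(p1)[np.newaxis, :, :]   # shape (1, n, 2)
    p2 = np.radians(p2)[:, np.newaxis, :]   # shape (m, 1, 2)
    dlat = p2[..., 0] - p1[..., 0]
    dlon = p2[..., 1] - p1[..., 1]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(p1[..., 0]) * np.cos(p2[..., 0]) * np.sin(dlon / 2) ** 2)
    return 6371 * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

def dist_pair(a, b):
    """Scalar distance in km between two single (lat, lon) points (hypothetical helper)."""
    a = np.asarray(a, dtype=float).reshape(1, 2)
    b = np.asarray(b, dtype=float).reshape(1, 2)
    return float(dist(a, b)[0, 0])
```

For a single pair the plain `math`-based function is likely just as fast. The array version pays off when many distances are computed in one call.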

【Solution 2】:

You can use numpy ufuncs to do everything in "vectorized" form:

from numpy import radians, sin, cos, sqrt, arctan2
from numpy import random

def max_dist(p1x,p1y,p2x,p2y):
    # give them "orthogonal" shape
    p1x = p1x.reshape(p1x.size,1)
    p1y = p1y.reshape(p1y.size,1)
    p2x = p2x.reshape(1,p2x.size)
    p2y = p2y.reshape(1,p2y.size)

    # compute the distance for all possible pairs
    dlat = radians(p2x-p1x)
    dlon = radians(p2y-p1y)

    a = sin(dlat/2) * sin(dlat/2) + cos(radians(p1x))*cos(radians(p2x)) * sin(dlon/2) * sin(dlon/2)
    c = 2 * arctan2(sqrt(a), sqrt(1-a))
    d = 6371 * c

    return d.max()


if __name__=='__main__':
    # generate random samples
    N = 1000
    p1x,p1y,p2x,p2y = random.rand(4,N)

    print('max_dist= %s' % max_dist(p1x, p1y, p2x, p2y))

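【Solution 3】:

I am not sure about vectorization, but you can certainly turn the double for loop into a list comprehension. Since you only take the max of that list, you can also use a generator expression.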

def distGen(pts):
    return max(dist(pts[i], pts[j]) for i in range(len(pts)) 
                                    for j in range(i+1, len(pts)))

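I did some timing analysis on this, and it seems to be at least a bit faster. Interestingly, and contrary to my intuition, using a list comprehension instead of a generator is faster, but the generator should have the advantage of using less memory.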

1.15502595901   # your approach
1.37675499916   # your approach single max value var instead of list
1.00971293449   # above generator expression
0.916918992996  # above with list comprehension, i.e., max([...])

(Tested with Python 2.7, using a list of 1000 random numbers instead of points, with dist measuring the absolute distance between those numbers.)

Using itertools.combinations to get all combinations of two points is even better: cleaner, and also a bit faster:

import itertools
def distComb(pts):
    return max(dist(p1, p2) for p1, p2 in itertools.combinations(pts, 2))

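【Discussion】:

There is still a double loop in the generator expression and in the combinations() function, although it is now expressed implicitly. Execution time drops because part of the work moves to the C level, but the algorithm is basically unchanged.
@ffriend Sure, I did not claim otherwise, did I? I still need to wrap my head around those vectorized solutions. +1 to all of them. Sau, I don't mind if you accept a different answer.
@tobias_k: I did not mean to blame you or anything like that, it was just an observation. Sorry if it sounded that way :)
@ffriend No offense taken! :-)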
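
To illustrate the point about the implicit double loop: itertools.combinations visits exactly the same pairs as the nested `for i` / `for j in range(i+1, ...)` loops, namely n(n-1)/2 of them, so the amount of work is unchanged. A small sketch that counts the pairs both ways (the function names are made up for this illustration):

```python
import itertools

def count_pairs_nested(n):
    """Count the (i, j) pairs with i < j visited by the explicit double loop."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count

def count_pairs_combinations(n):
    """Count the pairs produced by itertools.combinations over n items."""
    return sum(1 for _ in itertools.combinations(range(n), 2))
```

For 1000 points both counts come to 499500 distance evaluations. The numpy versions compute the full 1000 x 1000 matrix, roughly twice as many entries, but do so in compiled code.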

【Solution 4】:

Here is another solution, which first maps all the points onto a unit sphere:

import numpy as np
import scipy.spatial

def sphereify(points):
    """lat, long -> x, y, z  for a unit sphere"""
    lat = np.radians(points[:, 0, np.newaxis])
    long = np.radians(points[:, 1, np.newaxis])
    return np.hstack((
        np.cos(lat) * np.cos(long),
        np.cos(lat) * np.sin(long),
        np.sin(lat)
    ))


def arcDistance(chordDistance):
    """Get the surface distance corresponding to the chord distance""" 
    return np.arcsin(chordDistance / 2) * 2

earthRadius = 6371
def query(points):
    dists = scipy.spatial.distance.pdist(sphereify(points))
    surfaceDist = earthRadius * arcDistance(dists.max())
    return surfaceDist

Then:

>>> timeit query(points)
100 loops, best of 3: 6.23 ms per loop
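
Why converting only the largest chord is enough: on a unit sphere, two points separated by a central angle θ are a straight-line (chord) distance of 2·sin(θ/2) apart, which grows monotonically with θ on [0, π]. The pair with the largest chord is therefore also the pair with the largest arc. The sketch below checks, using numpy only, that the chord-to-arc conversion agrees with the haversine formula for a single pair; the function names are hypothetical and not taken from the answer.

```python
import numpy as np

def to_unit_sphere(lat_deg, lon_deg):
    """(lat, lon) in degrees -> (x, y, z) on the unit sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def chord_angle(lat1, lon1, lat2, lon2):
    """Central angle in radians, recovered from the straight-line chord length."""
    chord = np.linalg.norm(to_unit_sphere(lat1, lon1) - to_unit_sphere(lat2, lon2))
    # clip guards against rounding pushing chord / 2 slightly above 1
    return 2 * np.arcsin(np.clip(chord / 2, -1.0, 1.0))

def haversine_angle(lat1, lon1, lat2, lon2):
    """Central angle in radians from the haversine formula."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * np.arcsin(np.sqrt(a))
```

Multiplying either angle by the earth radius (6371 km in this thread) gives the surface distance.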
