【Posted】: 2017-02-21 16:51:10
【Question】:
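I have been working on this function, which generates some parameters I need for a simulation code I am developing, and I am having trouble improving its performance.
Profiling the code shows that this function is the main bottleneck, so any improvement, however minor, would be great.
I would like to try vectorizing parts of this function, but I am not sure whether that is possible.
The main challenge is that the values stored in my array params depend on their own indices. The only straightforward solution I see is np.ndenumerate, but that seems slow.
Is it possible to vectorize this type of operation, where the values stored in an array depend on where they are stored? Or would it be smarter/faster to create a generator that just yields tuples of the array's indices?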
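For reference, NumPy already ships an index-only generator, np.ndindex. The following is a minimal sketch on a small array, not the real 9-dimensional case; the helper name fill_by_index and the rule "sum of the indices" are hypothetical stand-ins for any index-dependent formula.

```python
import numpy as np


def fill_by_index(shape):
    """Fill an array whose entries depend only on their own indices.

    Hypothetical helper: the rule used here (sum of the indices) is a
    placeholder for any formula that depends on the position.
    """
    out = np.zeros(shape)
    # np.ndindex yields only the index tuples; unlike np.ndenumerate it
    # does not also read the current value at each position.
    for idx in np.ndindex(*shape):
        out[idx] = sum(idx)
    return out


result = fill_by_index((2, 3))
```

Because the loop body still runs once per element in the Python interpreter, this is unlikely to be much faster than np.ndenumerate on its own; it only avoids fetching values that are never used.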
import numpy as np
from scipy.sparse import linalg as LA

def get_params(num_bonds, energies):
    """
    Returns the interaction parameters of different pairs of atoms.

    Parameters
    ----------
    num_bonds : ndarray, shape = (M, 20)
        Sparse array containing the number of nearest neighbor bonds for
        different pairs of atoms (denoted by their column) and
        next-nearest neighbor bonds. Columns 0-9 contain nearest
        neighbors, 10-19 contain next-nearest neighbors
    energies : ndarray, shape = (M, )
        Energy vector corresponding to each atomic system stored in each
        row of num_bonds.
    """
    # -- Compute the bond energies
    x = LA.lsqr(num_bonds, energies, show=False)[0]

    params = np.zeros([4, 4, 4, 4, 4, 4, 4, 4, 4])

    nn = {(0,0): x[0], (1,1): x[1], (2,2): x[2], (3,3): x[3], (0,1): x[4],
          (1,0): x[4], (0,2): x[5], (2,0): x[5], (0,3): x[6], (3,0): x[6],
          (1,2): x[7], (2,1): x[7], (1,3): x[8], (3,1): x[8], (2,3): x[9],
          (3,2): x[9]}

    nnn = {(0,0): x[10], (1,1): x[11], (2,2): x[12], (3,3): x[13], (0,1): x[14],
           (1,0): x[14], (0,2): x[15], (2,0): x[15], (0,3): x[16], (3,0): x[16],
           (1,2): x[17], (2,1): x[17], (1,3): x[18], (3,1): x[18], (2,3): x[19],
           (3,2): x[19]}

    # params contains the energy contribution of each site due to its
    # local environment. The shape is given by the number of possible atom
    # types and the number of sites in the lattice.
    for (i,j,k,l,m,jj,kk,ll,mm), val in np.ndenumerate(params):
        params[i,j,k,l,m,jj,kk,ll,mm] = (nn[(i,j)] + nn[(i,k)] + nn[(i,l)] +
                                         nn[(i,m)] + nnn[(i,jj)] +
                                         nnn[(i,kk)] + nnn[(i,ll)] + nnn[(i,mm)])

    return np.ascontiguousarray(params)
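
Since every term in the loop is a lookup keyed on the first index and one other index, one possible vectorization is to store the pair energies in 4x4 tables and add them with broadcasting. This is only a sketch under assumptions: pair_table and build_params are hypothetical names, x stands for the 20-element solution returned by lsqr, and the ordering of its entries is taken from the nn and nnn dictionaries above.

```python
import numpy as np


def pair_table(x, offset):
    """Build a symmetric 4x4 table from ten consecutive entries of x.

    Assumed ordering (same as the nn/nnn dictionaries): the like pairs
    (0,0)..(3,3) first, then (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
    """
    table = np.zeros((4, 4))
    table[np.diag_indices(4)] = x[offset:offset + 4]
    # Upper-triangle indices come out in row-major order:
    # (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)
    rows, cols = np.triu_indices(4, k=1)
    table[rows, cols] = x[offset + 4:offset + 10]
    table[cols, rows] = x[offset + 4:offset + 10]  # mirror to keep it symmetric
    return table


def build_params(x):
    """Hypothetical replacement for the np.ndenumerate loop."""
    nn = pair_table(x, 0)     # nearest-neighbor energies
    nnn = pair_table(x, 10)   # next-nearest-neighbor energies
    params = np.zeros((4,) * 9)
    for axis in range(1, 9):
        # Axes 1-4 are nearest neighbors, axes 5-8 next-nearest neighbors.
        table = nn if axis <= 4 else nnn
        # View the table as (4, 1, ..., 4, ..., 1) so that it broadcasts
        # over axis 0 (the site type) and the current neighbor axis.
        shape = [1] * 9
        shape[0] = 4
        shape[axis] = 4
        params += table.reshape(shape)
    return params
```

The Python-level loop here runs only eight times, once per neighbor axis, instead of once per each of the 4**9 elements; the per-element additions are done inside NumPy. Because params is created with np.zeros and updated in place, it is already C-contiguous.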
【Comments】:
Tags: python performance python-2.7 numpy vectorization