使用带有矢量化的 numpy.einsum 的 Numpy 矩阵乘法答案

【问题标题】：Numpy martix multiplication using numpy.einsum with vectorization使用带有矢量化的 numpy.einsum 的 Numpy 矩阵乘法
【发布时间】：2020-10-30 13:08:45
【问题描述】：

我想旋转图像。

start 和 normal 的形状是 (429, 1024, 3) 腐烂的形状是 (3, 3) 以下代码运行正常，但需要一些时间才能完成。

#rotation 30 degree
s = numpy.sin(numpy.pi * 30 / 180)
c = numpy.cos(numpy.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]


for i in range(height):
  for j in range(width):
    for arr in [start, norm]:
        x = arr[i,j,0]
        y = arr[i,j,1]
        z = arr[i,j,2]
        for d in range(3):
            arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z

我尝试对代码进行矢量化，但有条件使用 numpy.einsum 作为每个像素的矢量需要相乘。

#Moving 30 degree
s = numpy.sin(numpy.pi * 30 / 180)
c = numpy.cos(numpy.pi * 30 / 180)

rot = numpy.array([[1.0, 0.0, 0.0], [0.0,   c,   s], [0.0,  -s,   c]])
 
start[:,:,:3] = numpy.einsum('ij,j',rot[:3,0],start[:,:,0]) + 
 numpy.einsum('ij,j',rot[:3,1],start[:,:,1]) + numpy.einsum('ij,j',rot[:3,2],start[:,:,2])

norm[:,:,:3] = numpy.einsum('ij,j',rot[:3,0],norm[:,:,0]) + 
 numpy.einsum('ij,j',rot[:3,1],norm[:,:,1]) + numpy.einsum('ij,j',rot[:3,2],norm[:,:,2])

上面的代码给出了错误“einstein sum subscripts string contains too many subscripts for operand 0”。

我应该对矢量化形式的代码做哪些更改？？

【问题讨论】：

这不是start.dot(rot)的一种非常复杂的做法吗？
这看起来不像传统意义上的图像旋转。它会旋转颜色空间中单个像素的颜色。
好收获@QuangHoang
rot[:3,0] 是一维数组，因此您不能指定 ij 作为其下标。 einsum 不“知道”rot 是 2d，它只看到索引的结果。 start[:,:,0] 是 2d，因此仅在下标 j 上不起作用。使用einsum 时，索引表达式必须与参数兼容。密切关注！

标签： python image numpy

【解决方案1】：

你可以使用下一个简单的代码：

a_ = np.empty_like(a)
for d in range(3):
    a_[:, :, d] = rot[d][0] * a[:, :, 0] + rot[d][1] * a[:, :, 1] + rot[d][2] * a[:, :, 2]
a = a_

下面的完整用法示例：

Try it online!

import numpy as np

height, width = 100, 179
a = np.random.uniform(0., 1., (height, width, 3))
a0, a1 = np.copy(a), np.copy(a)

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]

# -------- Version 1, non-vectorized --------------

for i in range(height):
  for j in range(width):
    for arr in [a0]:
        x = arr[i,j,0]
        y = arr[i,j,1]
        z = arr[i,j,2]
        for d in range(3):
            arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z

# -------- Version 2, vectorized --------------

a1_ = np.empty_like(a1)
for d in range(3):
    a1_[:, :, d] = rot[d][0] * a1[:, :, 0] + rot[d][1] * a1[:, :, 1] + rot[d][2] * a1[:, :, 2]
a1 = a1_

# ------ Check that we have same solution --------
assert np.allclose(a0, a1)

下面这两种解决方案的时间测量代码，矢量化解决方案似乎比非矢量化解决方案快48x 倍，代码需要通过python -m pip install numpy timerit 安装一次模块：

# Needs: python -m pip install numpy timerit

import numpy as np

s = np.sin(np.pi * 30 / 180)
c = np.cos(np.pi * 30 / 180)

rot = [[1.0, 0.0, 0.0],
  [0.0,   c,   s],
  [0.0,  -s,   c]]

# -------- Version 1, non-vectorized --------------

def f0(a):
    height, width = a.shape[:2]
    a = np.copy(a)
    for i in range(height):
      for j in range(width):
        for arr in [a]:
            x = arr[i,j,0]
            y = arr[i,j,1]
            z = arr[i,j,2]
            for d in range(3):
                arr[i,j,d] = rot[d][0] * x + rot[d][1] * y + rot[d][2] * z
    return a

# -------- Version 2, vectorized --------------

def f1(a):
    height, width = a.shape[:2]
    a_ = np.empty_like(a)
    for d in range(3):
        a_[:, :, d] = rot[d][0] * a[:, :, 0] + rot[d][1] * a[:, :, 1] + rot[d][2] * a[:, :, 2]
    a = a_
    return a

# ----------- Time/Speedup Measure ----------------

from timerit import Timerit
Timerit._default_asciimode = True

height, width = 100, 179
a = np.random.uniform(0., 1., (height, width, 3))

ra, rt = None, None
for f in [f0, f1]:
    print(f'{f.__name__}: ', end = '', flush = True)
    tim = Timerit(num = 15, verbose = 1)
    for t in tim:    
        ca = f(a)
    ct = tim.mean()
    if ra is None:
        ra, rt = ca, ct
    else:
        assert np.allclose(ra, ca)
        print(f'speedup {round(rt / ct, 2)}x')

输出：

f0: Timed best=144.785 ms, mean=146.205 +- 1.9 ms
f1: Timed best=2.781 ms, mean=3.022 +- 0.1 ms
speedup 48.38x

【讨论】：

感谢矢量化代码，但只想用 np.einsum 替换 rot[d][0] * a[:, :, 0]。
@KetanChaudhari 如果您特别需要np.einsum 而没有其他方法，那么我无法帮助您，从未使用过np.einsum，很遗憾...

【解决方案2】：

从我在您的代码中可以看出，正确的einsum 签名应该是：

start = np.einsum('ijl, kl -> ijk', start, rot)
norm  = np.einsum('ijl, kl -> ijk',  norm, rot)

但是图片数组的最后一个维度是 RGB 颜色，而不是 XYZ 坐标（正如 cmets 中的 @QuongHoang 所指出的那样），因此这不会像您想要的那样“旋转”图片。您将只是旋转色彩空间

【讨论】：

它给出错误“ValueError: einstein sum subscripts string contains too many subscripts for operand 0”
什么是start.shape？
起始形状为 (429, 1024, 3)
和 rot.shape 是 (3,3)，所以它应该可以工作（并且在我的机器上使用这些形状）
不适用于 np.einsum('ijl, kl -> ijk', start[:, :, 0], rot[d][0])