Extending numpy mask (answers)

【Question title】: Extending numpy mask
【Posted】: 2016-02-17 16:20:15
【Question】:

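I want to mask a numpy array a with mask. The mask does not have exactly the same shape as a, but a can be masked with it anyway (I guess because the additional dimension has length 1 (broadcasting?)).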

a.shape
>>> (3, 9, 31, 2, 1)
mask.shape
>>> (3, 9, 31, 2)
masked_a = ma.masked_array(a, mask)

However, the same logic does not work for array b, whose last dimension has 5 elements.

ext_mask = mask[..., np.newaxis] # extending or not extending has same effect
ext_mask.shape
>>> (3, 9, 31, 2, 1)

b.shape
>>> (3, 9, 31, 2, 5)
masked_b = ma.masked_array(b, ext_mask)
>>> numpy.ma.core.MaskError: Mask and data not compatible: data size is 8370, mask size is 1674.

How can I create a (3, 9, 31, 2, 5) mask from the (3, 9, 31, 2) mask by expanding every True value in the last dimension of the (3, 9, 31, 2) mask to [True, True, True, True, True] (and likewise for False)?
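A side note on the two cases above: the error message reports element counts rather than shapes, which suggests that ma.masked_array matches a mask to the data by size rather than by broadcasting. The sketch below reproduces both cases. The array contents are random stand-ins (an assumption); only the shapes come from the question.

```python
import numpy as np
from numpy import ma

# Stand-in data; only the shapes are taken from the question.
rng = np.random.default_rng(0)
a = rng.standard_normal((3, 9, 31, 2, 1))
b = rng.standard_normal((3, 9, 31, 2, 5))
mask = rng.standard_normal((3, 9, 31, 2)) > 0

# a and mask hold the same number of elements, so the mask is accepted.
masked_a = ma.masked_array(a, mask)
print(masked_a.shape, masked_a.mask.shape)

# b holds five times as many elements as ext_mask, so this is rejected.
ext_mask = mask[..., np.newaxis]
try:
    ma.masked_array(b, ext_mask)
    raised = False
except ma.MaskError as exc:
    raised = True
    print(exc)
```

If this reading is right, the first case works because the element counts agree, not because the trailing length-1 axis is broadcast.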

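【Question comments】:

This works: masked_b = ma.masked_array(*np.broadcast(b, ext_mask)), but I don't know why ma.masked_array doesn't broadcast automatically. Edit: maybe because, for efficiency, it only wants to store views of two equally sized arrays?
That gives TypeError: __new__() takes at most 11 arguments (8371 given)
Uh, sorry, my mistake! broadcast is the wrong function. You need to use broadcast_arrays.
The documentation says broadcast_arrays returns views on the original arrays, which means no allocation is performed.
Yes, I'll write an answer, but first I want to look into the topic a bit more :)

Tags: python arrays numpy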

【Solution 1】:

This gives the expected result:

masked_b = ma.masked_array(*np.broadcast_arrays(b, ext_mask))

I have not profiled this approach, but it should be faster than allocating a new mask. According to the documentation, no data is copied:

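These arrays are views on the original arrays. They are typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. If you need to write to the arrays, make copies first.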
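The last sentence of that note matters if the mask will be modified later. A minimal sketch, assuming random stand-in arrays with the shapes from the question, of turning the broadcast view into an independent, writable mask:

```python
import numpy as np

mask = np.random.randn(3, 9, 31, 2) > 0   # stand-in mask
b = np.random.randn(3, 9, 31, 2, 5)       # stand-in data

# Broadcast view: no data is copied, the last axis re-reads the same memory.
_, view = np.broadcast_arrays(b, mask[..., np.newaxis])

# Explicit copy: owns its memory, so writing to it cannot touch `mask`.
full_mask = view.copy()
full_mask[0, 0, 0, 0, 0] = not full_mask[0, 0, 0, 0, 0]

print(np.shares_memory(view, mask))       # the view aliases the original mask
print(np.shares_memory(full_mask, mask))  # the copy does not
```

The copy costs the allocation that the view avoids, so it only makes sense when the mask really has to be edited in place.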

The no-copy behavior can be verified:

bb, mb = np.broadcast_arrays(b, ext_mask)
print(mb.shape)       # (3, 9, 31, 2, 5) - same shape as b
print(mb.base.shape)  # (3, 9, 31, 2) - the shape of the original mask
print(mb.strides)     # (558, 62, 2, 1, 0) - that's how it works: 0 stride

It is impressive how the numpy developers implemented broadcasting. Values are repeated by using a stride of 0 along the last dimension. Wow!
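The zero-stride idea can be seen in isolation on a tiny array. The sketch below uses np.broadcast_to, a call related to np.broadcast_arrays that produces the same kind of view; the small example array is purely illustrative.

```python
import numpy as np

col = np.array([[True], [False], [True]])  # shape (3, 1), one byte per element
wide = np.broadcast_to(col, (3, 5))        # shape (3, 5), read-only view

print(col.strides)
print(wide.strides)   # last stride is 0: every column reads the same byte
print(wide)
print(np.shares_memory(wide, col))
```

Moving one step along the last axis advances the memory address by 0 bytes, so the five "copies" in each row are the same stored value.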

Edit

I compared the speed of broadcasting and allocating with this code:

import numpy as np
from numpy import ma

a = np.random.randn(30, 90, 31, 2, 1)
b = np.random.randn(30, 90, 31, 2, 5)

mask = np.random.randn(30, 90, 31, 2) > 0
ext_mask = mask[..., np.newaxis]

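def broadcasting(a=a, b=b, ext_mask=ext_mask):
    # Mask is a zero-copy broadcast view of ext_mask with the shape of b.
    mb1 = ma.masked_array(*np.broadcast_arrays(b, ext_mask))

def allocating(a=a, b=b, ext_mask=ext_mask):
    # Allocate a full-size boolean mask and fill it by broadcasting assignment.
    m2 = np.empty(b.shape, dtype=bool)
    m2[:] = ext_mask
    mb2 = ma.masked_array(b, m2)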
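Before comparing speed, it may be worth confirming that both approaches build the same mask. A small sketch under the same kind of setup, with the two functions rewritten to return their result (the names broadcast_mask and allocate_mask are made up for this example):

```python
import numpy as np
from numpy import ma

b = np.random.randn(3, 9, 31, 2, 5)
mask = np.random.randn(3, 9, 31, 2) > 0
ext_mask = mask[..., np.newaxis]

def broadcast_mask(b, ext_mask):
    # Mask is a zero-copy broadcast view of ext_mask.
    return ma.masked_array(*np.broadcast_arrays(b, ext_mask))

def allocate_mask(b, ext_mask):
    # Mask is a freshly allocated array filled by broadcasting assignment.
    m = np.empty(b.shape, dtype=bool)
    m[:] = ext_mask
    return ma.masked_array(b, m)

mb1 = broadcast_mask(b, ext_mask)
mb2 = allocate_mask(b, ext_mask)

# Compare the full boolean masks element by element.
same = np.array_equal(ma.getmaskarray(mb1), ma.getmaskarray(mb2))
print(same)
```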

Broadcasting is clearly faster than allocating here:

    # array size: (30, 90, 31, 2, 5)

In [23]: %timeit broadcasting()
The slowest run took 10.39 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 39.4 µs per loop

In [24]: %timeit allocating()
The slowest run took 4.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 982 µs per loop

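Note that I had to increase the array size for the speed difference to become apparent. With the original array dimensions, allocating is slightly faster than broadcasting: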

    # array size: (3, 9, 31, 2, 5)

In [28]: %timeit broadcasting()
The slowest run took 9.36 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 39 µs per loop

In [29]: %timeit allocating()
The slowest run took 9.22 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 32.6 µs per loop

The runtime of the broadcasting solution does not seem to depend on the array size.
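The %timeit magic is only available inside IPython. For a plain script, roughly the same comparison can be made with the standard-library timeit module. This is only a sketch: the helper name time_both and the repeat counts are arbitrary choices, and the absolute numbers will vary with machine and NumPy version.

```python
import timeit
import numpy as np
from numpy import ma

def time_both(shape, number=100):
    """Return (broadcast_seconds, allocate_seconds) per call for data of `shape`."""
    b = np.random.randn(*shape)
    ext_mask = (np.random.randn(*shape[:-1]) > 0)[..., np.newaxis]

    def broadcasting():
        ma.masked_array(*np.broadcast_arrays(b, ext_mask))

    def allocating():
        m = np.empty(b.shape, dtype=bool)
        m[:] = ext_mask
        ma.masked_array(b, m)

    # Best of 3 repeats, divided by the number of calls per repeat.
    t_b = min(timeit.repeat(broadcasting, number=number, repeat=3)) / number
    t_a = min(timeit.repeat(allocating, number=number, repeat=3)) / number
    return t_b, t_a

for shape in [(3, 9, 31, 2, 5), (30, 90, 31, 2, 5)]:
    t_b, t_a = time_both(shape)
    print(shape, f"broadcast {t_b * 1e6:.1f} µs, allocate {t_a * 1e6:.1f} µs")
```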

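【Comments】:

Which array sizes did the two tests use?
(30, 90, 31, 2, x) and (3, 9, 31, 2, x)
Interesting. Broadcasting seems to depend very little on array size (if at all). Definitely the better option. Thanks again.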