Python numpy AttributeError '_collapse': answers

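【Question title】: Python numpy Attribute Error '_collapse'
【Posted】: 2015-03-02 05:45:24
【Question】:

I'm getting the strangest attribute error in Python, and I can't find anything about it online. I'm trying to sum the elements of all columns of a matrix y and store them in a new matrix. y is a 1063 x 1063 identity matrix of 1. and 0. values; mat is a 70000 x 1063 sparse matrix: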

mat = scipy.sparse.rand(70000, 1063, density=0.01, format='coo', dtype=None, random_state=None)
mat.shape

gives me:

(70000, 1063)

Now I create y, a 1063 x 1063 identity matrix:

y = np.matlib.identity(1063)  
ind = np.nonzero((mat.sum(axis=0) < 20))
y[ind, :] = 0                 # replace element at given index with 0 

x = np.sum(y, axis=1)         # here i want to count the elements of all columns of y

The last line raises the following error:

AttributeError: 'numpy.ndarray' object has no attribute '_collapse'

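I'm lost. Any ideas on how to solve this?

【Comments】:

What type of data are you passing in the ndarray, y?
y is a matrix of floats
What does type(y) give?
Oh, it gives: numpy.matrixlib.defmatrix.matrix
What is y.shape? It would be best to provide a minimal non-working example. See How to Ask.

Tags: python numpy matrix pandas attributeerror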

【Solution 1】:

In matrixlib/defmatrix.py, _collapse is defined as a method of the matrix class:

def _collapse(self, axis):
    """A convenience function for operations that want to collapse
    to a scalar like _align, but are using keepdims=True
    """
    if axis is None:
        return self[0, 0]
    else:
        return self

_collapse is used in:

def sum(self, axis=None, dtype=None, out=None):
    return N.ndarray.sum(self, axis, dtype, out, keepdims=True)._collapse(axis)

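It is used the same way in .mean, .prod, .any, .max and similar methods: basically any operation that would normally reduce the dimensions of the matrix.

Normally these operations return an array of the same type as the input, so if y is a matrix they should return a matrix. Since a matrix is always 2-D, keepdims=True is used. ._collapse is needed when the operation reduces the matrix to a scalar (i.e. when axis is None); in that case we want a true scalar, not one wrapped in a matrix.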
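To make the keepdims/_collapse behaviour concrete, here is a small sketch of what a healthy np.matrix is expected to do. The 2 x 3 example matrix is my own, not the asker's data:

```python
import numpy as np

# Small illustrative matrix (not the asker's data).
m = np.matrix([[1, 2, 3],
               [4, 5, 6]])

# Reducing along an axis keeps the result 2-D and of type matrix.
col_sums = m.sum(axis=0)   # shape (1, 3)
row_sums = m.sum(axis=1)   # shape (2, 1)

# With axis=None the (1, 1) intermediate is collapsed to a plain scalar.
total = m.sum()

print(type(col_sums), col_sums.shape)
print(type(row_sums), row_sums.shape)
print(type(total), total)
```

If your installation behaves differently on this snippet, that points at the environment rather than at your data.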

I doubt this part of the code has changed in years (I'll double-check on GitHub).

So it is defined for matrix, not for ndarray.

In [154]: np.matrix([[1,0],[0,1]])._collapse(0)
Out[154]: 
matrix([[1, 0],
        [0, 1]])

In [155]: np.array([[1,0],[0,1]])._collapse(0)
...
AttributeError: 'numpy.ndarray' object has no attribute '_collapse'

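So it looks as though np.sum is not getting a matrix back from the underlying call, even though the input is one.

I wonder whether other reduction functions have the same problem, e.g.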

y.max(axis=0)
np.add.reduce(y, axis=0)

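y.max, y.prod etc. are all coded the same way as y.sum. For a matrix, that means calling the underlying ndarray function, followed by ._collapse.

np.add.reduce(y, axis=1, keepdims=True) is functionally very similar, although the route to the underlying C code is different. It does not try to call ._collapse, which means that for axis=None it does not reduce the result to a scalar; it leaves a (1,1) matrix. ._collapse can still be applied, as in: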

np.add.reduce(np.matrix('1 2 3; 4 5 6'),axis=None, keepdims=True)._collapse(None)
# 21

Another way around the np.sum problem is to convert y to an array (and optionally back to a matrix):

np.matrix(np.sum(y.A, axis=1, keepdims=True))
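As a rough sketch of that idea applied to the whole calculation, the following keeps everything as plain ndarrays so that the matrix-specific sum (and ._collapse) is never involved. The tiny 3 x 3 sparse matrix is made up for illustration, and I'm assuming the intent is to zero the rows of y whose corresponding column sum in mat is below 20:

```python
import numpy as np
import scipy.sparse as sp

# Tiny stand-in for the 70000 x 1063 sparse matrix (made-up values).
mat = sp.coo_matrix(np.array([[5, 0, 1],
                              [7, 0, 0],
                              [9, 2, 0]]))

# Sparse .sum may return a 2-D np.matrix; flatten it to a 1-D ndarray.
col_sums = np.asarray(mat.sum(axis=0)).ravel()   # [21, 2, 1]

# Indices of the columns whose sum is below the threshold.
ind = np.nonzero(col_sums < 20)[0]               # [1, 2]

# Use a plain ndarray identity instead of np.matlib.identity.
y = np.eye(mat.shape[1])
y[ind, :] = 0

# Ordinary ndarray reduction, no matrix machinery involved.
x = y.sum(axis=1)
print(x)   # [1. 0. 0.]
```

If you need the 2-D column shape that matrix would give, y.sum(axis=1, keepdims=True) returns a (3, 1) array here.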

sparse takes another route to .sum: it multiplies the matrix by a matrix of ones:

y * np.asmatrix(np.ones((y.shape[1],1),int))
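A quick check of that trick on a small example of my own, showing that multiplying by a column of ones gives the row sums, and that the same thing can be written with plain arrays and the @ operator:

```python
import numpy as np

# Small illustrative matrix (not the asker's data).
y = np.matrix([[1, 2, 3],
               [4, 5, 6]])

# For np.matrix, * is matrix multiplication, so this sums each row.
ones = np.asmatrix(np.ones((y.shape[1], 1), int))
via_product = y * ones            # shape (2, 1)

# The same idea with plain ndarrays.
a = np.asarray(y)
via_array = a @ np.ones(a.shape[1], dtype=int)   # shape (2,)

print(via_product)
print(via_array)
```

Multiplying by a row of ones on the left (ones.T * y, with ones of length y.shape[0]) would give the column sums instead.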

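I wonder whether your problem is caused by some other module you are importing that overrides some definition, such as the matrix type. You have a pandas tag. Does that mean you load pandas as part of this calculation? I'm not blaming pandas, but it suggests that the program environment is more complicated. Try the calculation in the simplest possible program.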
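Something along these lines could serve as that "simplest program". It is only a diagnostic sketch: the 4 x 4 identity is a stand-in for your y, and the checks are my guesses at what is worth looking at. Run it in a fresh interpreter with nothing else imported:

```python
import numpy as np

print(np.__version__)

# Stand-in for y: a small identity of type np.matrix.
y = np.asmatrix(np.eye(4))
print(type(y))
print(hasattr(y, "_collapse"))    # expected True for a real np.matrix

# This is the call that matrix.sum makes before ._collapse.
# It should preserve the subclass and return a matrix.
r = np.ndarray.sum(y, axis=1, keepdims=True)
print(type(r), r.shape)

# If the line above reports ndarray in your environment, that would
# explain the AttributeError.
print(y.sum(axis=1))
```

If this works in isolation but fails inside your full script, add your imports back one at a time until the behaviour changes.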

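【Comments】:

Thanks for your answer. I tried x = y.sum(axis=1) and got the same error again. So you're saying it might be because y is a matrix and I'm trying to apply an ndarray operation to it?
If y were an array, the matrix version of sum would not be called, and nothing would try to call ._collapse. How does np.matrix([]).sum(0) behave?
Does np.prod(y, axis=1) give the same error? (Or any other "reduce" method?)
Sorry for the late reply! I tried all of them. np.add.reduce(y, axis=0) works, but y.max(axis=0) and np.prod(y, axis=1) give me the same _collapse error. I can't explain it. Is there another way to sum the columns of a matrix?
np.add.reduce(y, axis=1)? I've added some thoughts on using add.reduce and on alternatives.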