numpy append_field gives shape error for new field with 2d shape

【Question title】: numpy append_field gives shape error for new field with 2d shape
【Posted】: 2012-12-10 05:17:53
【Question】:

I have a structured numpy array, and I want to use the recfunctions library (http://pyopengl.sourceforge.net/pydoc/numpy.lib.recfunctions.html) function append_fields() or rec_append_fields() to append a field that itself has a shape. However, I get an error:

ValueError: operands could not be broadcast together with shapes (10) (10,3)

where 10 is the length of my existing array and (3,) is the shape of the field I would like to append.

For example:

import numpy as np
from numpy.lib.recfunctions import append_fields


my_structured_array = np.array(
    # list(...) is needed on Python 3, where zip returns an iterator
    list(zip([0,1,2,3],[[4.3,3.2],[1.4,5.6],[6.,2.5],[4.5,5.4]])),
    dtype=[('id','int8'),('pos','2float16')]
    )
my_new_field = np.ones(
    len(my_structured_array),
    dtype='2int8'
    )
my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field
    )

ValueError: operands could not be broadcast together with shapes (4) (4,2)

Any ideas? I've tried making my_new_field a list of tuples and passing a dtype argument with the proper shape to append_fields():

my_new_field = len(my_structured_array)*[(1,1)]

my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field,
    dtypes='2int8'  # the keyword accepted by append_fields is `dtypes`
    )

but that seems to end up the same once it gets converted into a numpy array.

None of this seems to change when I use rec_append_fields() instead of plain append_fields().

EDIT: Given that my new field doesn't have the same shape as my array, I suppose the append I want is impossible, as @radicalbiscuit suggested.

In : my_new_field.shape
Out: (4, 2)

In : my_structured_array.shape
Out: (4,)

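However, to make my point, I deliberately included a field in the original array whose shape differs from that of the array itself: a field doesn't have to have the same shape as the structured array. How can I append a field like this?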

In : my_structured_array['pos'].shape
Out: (4, 2)

In : my_new_field.shape
Out: (4, 2)

I should note that for my application I could append an empty field, as long as it's possible to somehow change its shape later. Thanks!

【Comments】:

Tags: python numpy

【Solution 1】:

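append_fields() does indeed require that the two arrays have the same shape. That being said, as you realised with my_structured_array, numpy does support subarrays (that is, a field can itself be an array with a shape).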
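To make the distinction concrete, here is a minimal sketch (not from the original answer; the variable names and array contents are illustrative). It suggests that a subarray dtype wrapped in a named field leaves the outer array one-dimensional, whereas the same subarray dtype used on its own is expanded into the array's shape, which would explain why my_new_field ends up with shape (4, 2):

```python
import numpy as np

# A structured array whose 'pos' field is a length-2 subarray.
structured = np.zeros(4, dtype=[('id', 'i1'), ('pos', 'f2', (2,))])

# The same kind of subarray dtype used bare, without a field name:
# numpy folds the subarray dimensions into the array shape.
bare = np.ones(4, dtype=('i1', (2,)))

# Wrapping the subarray in a named field keeps the outer shape (4,).
wrapped = np.ones(4, dtype=[('new', 'i1', (2,))])

print(structured.shape, structured['pos'].shape)
print(bare.shape, bare.dtype)
print(wrapped.shape, wrapped['new'].shape)
```

Only the last form has the same outer shape as the structured array, which is the property append_fields() checks for.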

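In your case, I think you probably want my_new_field to be not a two-dimensional array but a one-dimensional array (of shape shape(my_structured_array)) whose elements have a structured dtype, e.g. dtype([('myfield', '<i8', (2,))]). For example,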

import numpy as np
from numpy.lib.recfunctions import append_fields

my_structured_array = np.array(
    # list(...) is needed on Python 3, where zip returns an iterator
    list(zip([0,1,2,3],[[4.3,3.2],[1.4,5.6],[6.,2.5],[4.5,5.4]])),
    dtype=[('id','int8'),('pos','2float16')]
    )

my_new_field = np.ones(
    len(my_structured_array),
    dtype=[('myfield', 'i8', 2)]
    )

my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field
    )

would yield,

>>> my_appended_array[0]
(0, [4.30078125, 3.19921875], ([1, 1],))

The resulting datatype is slightly inconvenient, though, because myfield is nested inside new,

>>> my_appended_array.dtype
dtype([('id', '|i1'), ('pos', '<f2', (2,)), ('new', [('myfield', '<i8', (2,))])])

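The nesting, however, can be coerced away fairly easily (though note that in the output shown here the myfield values come out as [0, 0] rather than [1, 1]),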

>>> np.asarray(my_appended_array, dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])
array([(0, [4.30078125, 3.19921875], [0, 0]),
       (1, [1.400390625, 5.6015625], [0, 0]), (2, [6.0, 2.5], [0, 0]),
       (3, [4.5, 5.3984375], [0, 0])], 
      dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])

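It is a little unfortunate, though, that we have had to repeat the dtype of my_structured_array here. At first glance numpy.lib.recfunctions.flatten_descr looks like it could do the tedious work of flattening the dtype, but unfortunately it returns a tuple rather than the list that np.dtype requires. Coercing its output to a list works around this,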

>>> np.dtype(list(np.lib.recfunctions.flatten_descr(my_appended_array.dtype)))
dtype([('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])

This can be passed as the dtype to np.asarray, making things more robust against changes to my_structured_array.dtype.

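Indeed, minor inconsistencies like this make working with record arrays a messy business. One gets the feeling that things could fit together more coherently.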

Edit: It turns out that the np.lib.recfunctions.merge_arrays function is much better suited to this sort of merging,

>>> from numpy.lib.recfunctions import merge_arrays
>>> my_appended_array = merge_arrays([my_structured_array, my_new_field], flatten=True)
>>> my_appended_array
array([(0, [4.30078125, 3.19921875], [1, 1]),
       (1, [1.400390625, 5.6015625], [1, 1]), (2, [6.0, 2.5], [1, 1]),
       (3, [4.5, 5.3984375], [1, 1])],
      dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])

【Comments】:

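So it seems that using a named field in my_new_field is the key, but that gives nested dtypes. I think that if the new field must have a shape, the best thing to do is to create a new array and then merge or join the two similarly shaped structured arrays, as in: stackoverflow.com/questions/5355744/…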
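The "create a new array" route mentioned in this comment can be sketched by hand, without recfunctions. The helper below, append_subarray_field, is a hypothetical name introduced only for illustration and is not part of numpy. It assumes the new values have the base array's shape as their leading dimensions, and it copies the data rather than modifying the base array in place:

```python
import numpy as np

def append_subarray_field(base, name, values):
    """Return a copy of `base` with an extra field `name` holding `values`.

    Hypothetical helper, not a numpy function.
    """
    values = np.asarray(values)
    if values.shape[:base.ndim] != base.shape:
        raise ValueError("leading dimensions of values must match base.shape")
    # Reuse the existing field dtypes, including any subarray shapes.
    descr = [(n, base.dtype[n]) for n in base.dtype.names]
    # The trailing dimensions of `values` become the per-record shape.
    descr.append((name, values.dtype, values.shape[base.ndim:]))
    out = np.empty(base.shape, dtype=descr)
    # Copy the old fields by name, then fill in the new one.
    for n in base.dtype.names:
        out[n] = base[n]
    out[name] = values
    return out

base = np.array(
    [(0, [4.3, 3.2]), (1, [1.4, 5.6]), (2, [6.0, 2.5]), (3, [4.5, 5.4])],
    dtype=[('id', 'i1'), ('pos', 'f2', (2,))],
)
extra = np.ones((4, 2), dtype='i1')

result = append_subarray_field(base, 'new', extra)
print(result.dtype)
print(result['new'].shape)
```

The result has a flat dtype with the new field alongside id and pos, so no nested dtype needs to be flattened afterwards.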

【Solution 2】:

append_fields() requires that the two arrays have the same shape, which in this case they do not. Printing the two arrays makes this apparent:

>>> my_structured_array
array([(0, [4.30078125, 3.19921875]), (1, [1.400390625, 5.6015625]),
       (2, [6.0, 2.5]), (3, [4.5, 5.3984375])], 
      dtype=[('id', '|i1'), ('pos', '<f2', (2,))])
>>> my_new_field
array([[1, 1],
       [1, 1],
       [1, 1],
       [1, 1]], dtype=int8)

As you can see, my_structured_array is an array of length 4 in which each element is a tuple containing two objects: an int and a list of two floats.

my_new_field, on the other hand, is an array of length 4 in which each element is a list of two ints. It's like trying to add apples and oranges.

Make your arrays the same shape and they will add.
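One possible way to give an existing (4, 2) array the same outer shape as the structured array is to reinterpret each row as a single record holding a (2,) subarray. This is only a sketch under stated assumptions: the field name 'new' is illustrative, and the view relies on the data being C-contiguous, which np.ascontiguousarray guards here.

```python
import numpy as np

my_new_field = np.ones((4, 2), dtype='i1')

# Reinterpret each row of two int8 values as one record with a (2,) subarray.
# The view has shape (4, 1) because the last axis collapses to one record.
as_records = np.ascontiguousarray(my_new_field).view([('new', 'i1', (2,))])

# Drop the trailing axis so the outer shape is (4,), like the structured array.
as_records = as_records.reshape(len(my_new_field))

print(as_records.shape)
print(as_records['new'].shape)
```

The resulting one-dimensional array of records has the same outer shape as my_structured_array.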

【Comments】:

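But my understanding is that a field in an array can have a different shape from the structured array itself: In : my_new_field.shape Out: (4, 2) In : my_structured_array.shape Out: (4,) In : my_structured_array['pos'].shape Out: (4, 2) How can I append a field shaped like one of the fields already in the array?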