How to optimize array storage within a numpy array? Answers

【Question title】: How to optimize array storage within a numpy array?
【Posted】: 2022-01-18 12:27:59
【Question】:

I have a numpy array of shape (n, m):

import numpy as np
foo = np.zeros((5,5))

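I do some calculation and get a result of shape (k, 2), where the number of rows k can differ from one calculation to the next: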

bar = np.zeros((8,2))

I want to store the calculation results in the array, because I may need to extend them after another calculation. I can do it like this:

foo = np.zeros((5,5), object)

# one calculation result for index (1, 1)
bar1 = np.zeros((8,2))
foo[1, 1] = bar1

# another calculation result for index (1, 1)
bar2 = np.zeros((5,2))
foo[1, 1] = np.concatenate((foo[1, 1], bar2))

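However, this seems odd to me, because I have to keep checking whether the array already holds a value at that position. I'm also not sure whether object is a good choice of dtype, since I only want to store numpy data, not arbitrary Python objects.

Is there a more numpy-specific way to do this?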

【Comments】:

A hash table (a dict in Python) of linked lists (lists in Python) looks like a better approach. For the hash key you can use 2D->1D array-style indexing, e.g. [1,1] -> 1*5 + 1

Tags: python arrays numpy
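
The comment's suggestion could be sketched roughly as follows. The helper name `flat_key`, the `results` dict, and the 5x5 grid are assumptions taken from the question's example, not part of the original comment:

```python
import numpy as np

N_ROWS, N_COLS = 5, 5  # grid shape assumed from the question's example

def flat_key(i, j, n_cols=N_COLS):
    """Map a 2D index (i, j) to a 1D key in row-major order: i * n_cols + j."""
    return i * n_cols + j

# dict of lists: flat key -> list of (k, 2) result arrays
results = {}

# setdefault creates the list on first use, so no explicit existence check is needed
results.setdefault(flat_key(1, 1), []).append(np.zeros((8, 2)))
results.setdefault(flat_key(1, 1), []).append(np.zeros((5, 2)))

# Join the stored pieces for one cell only when the full array is needed
combined = np.concatenate(results[flat_key(1, 1)])
print(flat_key(1, 1), combined.shape)  # 6 (13, 2)
```

NumPy's `np.ravel_multi_index((1, 1), (5, 5))` computes the same row-major key. A plain tuple such as `(1, 1)` also works directly as a dict key, so the flattening step is optional.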

【Solution 1】:

defaultdict simplifies the task of incrementally adding values to dict elements:

In [644]: from collections import defaultdict

Start with a dict whose default value is an empty list, [].

In [645]: dd = defaultdict(list)
In [646]: dd[(1,1)].append(np.zeros((1,2),int))
In [647]: dd[(1,1)].append(np.ones((3,2),int))
In [648]: dd
Out[648]: 
defaultdict(list,
            {(1, 1): [array([[0, 0]]), array([[1, 1],
                     [1, 1],
                     [1, 1]])]})

After collecting all the values, we can convert the nested lists into arrays:

In [649]: dd[(1,1)] = np.concatenate(dd[(1,1)])
In [650]: dd
Out[650]: 
defaultdict(list,
            {(1, 1): array([[0, 0],
                    [1, 1],
                    [1, 1],
                    [1, 1]])})
In [652]: dict(dd)
Out[652]: 
{(1,
  1): array([[0, 0],
        [1, 1],
        [1, 1],
        [1, 1]])}

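When doing the conversion, we have to watch out for keys whose value is [], because np.concatenate raises an error when given an empty list.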
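
One way to handle that caveat is sketched below. Storing a `(0, 2)` array for empty keys is an assumption made here for illustration (skipping those keys would work just as well), and the key `(2, 3)` and the name `merged` are hypothetical:

```python
from collections import defaultdict
import numpy as np

dd = defaultdict(list)
dd[(1, 1)].append(np.zeros((1, 2), int))
dd[(1, 1)].append(np.ones((3, 2), int))
dd[(2, 3)]  # merely reading a missing key creates an empty-list entry

# Concatenate non-empty lists; map empty lists to a (0, 2) array (assumed convention)
merged = {
    key: np.concatenate(chunks) if chunks else np.empty((0, 2), int)
    for key, chunks in dd.items()
}
print(merged[(1, 1)].shape)  # (4, 2)
print(merged[(2, 3)].shape)  # (0, 2)
```

Building a new plain dict instead of overwriting the entries of `dd` keeps the original lists available in case more results need to be appended later.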
