[Posted]: 2020-09-21 15:26:25
[Question]:
I have a 4D NumPy array like this:
>>> import numpy as np
>>> from functools import partial
>>> X = np.random.rand(20, 1, 10, 4)
>>> X.shape
(20, 1, 10, 4)
I compute the following statistics along axis 2: mean, std, median, p25, p75
>>> percentiles = tuple(partial(np.percentile, q=q) for q in (25, 75))
>>> stat_functions = (np.mean, np.std, np.median) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
so that:
>>> stats.shape
(20, 1, 5, 4)
>>> stats[0]
array([[[0.55187202, 0.55892688, 0.45816177, 0.6378181 ],
        [0.31028278, 0.32109677, 0.17319351, 0.13341651],
        [0.57112019, 0.60587194, 0.45490572, 0.59787335],
        [0.30857011, 0.30367621, 0.28899686, 0.55742753],
        [0.80678815, 0.82014851, 0.61295181, 0.70529412]]])
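I am also interested in the MAD (mean absolute deviation) as one of the statistics, so I defined this function myself because it is not available in NumPy.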
def mad(data):
    mean = np.mean(data)
    f = lambda x: abs(x - mean)
    vf = np.vectorize(f)
    return (np.add.reduce(vf(data))) / len(data)
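To make the starting point concrete, here is a small sketch of what this `mad` returns. On a 1D array it matches the usual mean absolute deviation, but on the 4D array it only sums over axis 0 and divides by `len(data)`, which is the length of the first axis. The sample array `x` and the names `mad_1d`, `reference_1d` and `mad_4d` are illustrative assumptions, not part of the original code.

```python
import numpy as np

def mad(data):
    # The function from the question, unchanged
    mean = np.mean(data)              # global mean over ALL elements
    f = lambda x: abs(x - mean)
    vf = np.vectorize(f)
    return (np.add.reduce(vf(data))) / len(data)

# 1D input: mean is 2.5, deviations are 1.5, 0.5, 0.5, 1.5
x = np.array([1.0, 2.0, 3.0, 4.0])
mad_1d = mad(x)
reference_1d = np.mean(np.abs(x - x.mean()))
print(mad_1d, reference_1d)  # 1.0 1.0

# 4D input: np.add.reduce sums over axis 0 only, and len(X) is 20
X = np.random.rand(20, 1, 10, 4)
mad_4d = mad(X)
print(mad_4d.shape)  # (1, 10, 4)
```

So the function is only a per-array statistic for 1D data; it has no notion of reducing along a chosen axis.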
But I ran into problems getting this function to work. First I tried:
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-fa6d972f0fce> in <module>()
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
<ipython-input-33-fa6d972f0fce> in <listcomp>(.0)
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
TypeError: mad() got an unexpected keyword argument 'axis'
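For context, the list comprehension calls every function as `func(X, axis=2, keepdims=True)`, so any custom statistic has to accept both keywords and treat them the way NumPy reductions do. A minimal sketch of that behaviour with `np.mean` (the variable names `reduced` and `kept` are illustrative):

```python
import numpy as np

X = np.random.rand(20, 1, 10, 4)

# Without keepdims, the reduced axis disappears
reduced = np.mean(X, axis=2)

# With keepdims=True, the reduced axis stays with length 1,
# which is what np.concatenate(..., axis=2) needs
kept = np.mean(X, axis=2, keepdims=True)

print(reduced.shape)  # (20, 1, 4)
print(kept.shape)     # (20, 1, 1, 4)
```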
Then I changed the definition of mad to:
def mad(data, axis=None):
    ...
which led to this:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-35-c74d9e3d057b> in <module>()
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
<ipython-input-35-c74d9e3d057b> in <listcomp>(.0)
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
TypeError: mad() got an unexpected keyword argument 'keepdims'
So I also added keepdims:
def mad(data, axis=None, keepdims=None):
    ...
which got me to:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-c74d9e3d057b> in <module>()
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 4 dimension(s) and the array at index 3 has 3 dimension(s)
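I know this is related to a dimension mismatch, but I'm not sure how to solve it in this case.
EDIT:
Following the answer given, I got a strange result after using the answer's mad function, like this: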
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
>>> stats.shape
(20, 1, 15, 4)
The expected output should have shape (20, 1, 6, 4), because I added one statistic along the third dimension: (np.mean, np.std, np.median, mad) + percentiles
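One way to track down a shape like (20, 1, 15, 4) is to print what each function contributes along axis 2. The sketch below is a guess at the cause, not the answer's actual code: `mad_unreduced` is a hypothetical stand-in that computes the absolute deviations but never averages them, so it contributes 10 rows instead of 1.

```python
import numpy as np
from functools import partial

X = np.random.rand(20, 1, 10, 4)

def mad_unreduced(data, axis=-1, keepdims=True):
    # Hypothetical stand-in (an assumption): deviations only, no final mean
    return np.abs(data - data.mean(axis, keepdims=True))

percentiles = tuple(partial(np.percentile, q=q) for q in (25, 75))
stat_functions = (np.mean, np.std, np.median, mad_unreduced) + percentiles

# Shape of each piece that would be passed to np.concatenate
shapes = [func(X, axis=2, keepdims=True).shape for func in stat_functions]
for func, shape in zip(stat_functions, shapes):
    # partial objects have no __name__, so fall back to repr
    print(getattr(func, "__name__", repr(func)), shape)

# Length of axis 2 after concatenation: 1 + 1 + 1 + 10 + 1 + 1
total = sum(shape[2] for shape in shapes)
print(total)  # 15
```

If the real function shows length 10 on axis 2 in such a listing, the missing step is the final reduction along that axis.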
EDIT-2
Using this function from the answer:
def mad(data, axis=-1, keepdims=True):
    return np.abs(data - data.mean(axis, keepdims=True)).mean(axis)
Then:
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
and then I run into this:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-fa6d972f0fce> in <module>()
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 4 dimension(s) and the array at index 3 has 3 dimension(s)
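In the EDIT-2 function the `keepdims` argument is accepted but never used: the final `.mean(axis)` drops axis 2, so the result is 3D while the other statistics are 4D. A minimal sketch of one possible fix, assuming MAD here means the mean absolute deviation around the mean, is to forward `keepdims` to the last reduction. The default values chosen for `axis` and `keepdims` are assumptions that mirror `np.mean`.

```python
import numpy as np
from functools import partial

def mad(data, axis=None, keepdims=False):
    # Mean absolute deviation around the mean along `axis`.
    # The centre is always computed with keepdims=True so it broadcasts against data.
    center = np.mean(data, axis=axis, keepdims=True)
    # Forward keepdims here so the output rank matches the other statistics
    return np.mean(np.abs(data - center), axis=axis, keepdims=keepdims)

X = np.random.rand(20, 1, 10, 4)
percentiles = tuple(partial(np.percentile, q=q) for q in (25, 75))
stat_functions = (np.mean, np.std, np.median, mad) + percentiles

stats = np.concatenate(
    [func(X, axis=2, keepdims=True) for func in stat_functions], axis=2
)
print(stats.shape)  # (20, 1, 6, 4)
```

With this version, `stats[:, :, 3, :]` holds the MAD values, in the same position that `mad` occupies in `stat_functions`.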
[Discussion]:
Tags: python numpy multidimensional-array numpy-ndarray