【Question title】: How to do element-wise rounding of NumPy array to first non-zero digit?
【Posted】: 2019-10-23 21:15:53
【Question】:

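I would like to "round" (not in the exact mathematical sense) the elements of a numpy array in the following way:

Given a 2D numpy array of shape NxN or NxM whose values lie between 0.00001 and 9.99999, such as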

import numpy as np

a = np.array([[1.232, 1.872, 2.732, 0.123],
              [0.0019, 0.025, 1.854, 0.00017],
              [1.457, 0.0021, 2.34, 9.99],
              [1.527, 3.3, 0.012, 0.005]])

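I basically want to "round" this numpy array by keeping only the first non-zero digit of each element (regardless of the digits that follow it), giving the output: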

output =np.array([[1.0, 1.0, 2.0, 0.1],
                 [0.001, 0.02, 1.0, 0.0001],
                 [1.0, 0.002, 2 , 9.0],
                 [1, 3, 0.01 , 0.005]]
        )

Thanks for any help!

【Comments】:

Use numpy.around
That does not work with numpy.around: when I call numpy.around(a, decimals=0), numbers such as 0.0019, 0.025 and 0.00017 all become 0 instead of 0.001, 0.02 and 0.0001.
Possible duplicate of "Rounding to significant figures in numpy"
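
The behaviour described in the second comment can be checked with a minimal sketch. The array name `small` is only illustrative and holds three of the values from the question.

```python
import numpy as np

# Three of the small values from the question (the name `small` is illustrative).
small = np.array([0.0019, 0.025, 0.00017])

# decimals=0 rounds to the nearest integer, so each of these values collapses to 0.
rounded = np.around(small, decimals=0)
print(rounded)
```

Because `decimals` is a single value applied to the whole array, it cannot adapt to the magnitude of each element, which is what the question needs.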

Tags: python arrays numpy rounding

【Solution 1】:

You can use np.logspace and np.searchsorted to determine the order of magnitude of each element, then floor-divide by it and multiply back

po10 = np.logspace(-10,10,21)
oom = po10[po10.searchsorted(a)-1]
a//oom*oom
# array([[1.e+00, 1.e+00, 2.e+00, 1.e-01],
#        [1.e-03, 2.e-02, 1.e+00, 1.e-04],
#        [1.e+00, 2.e-03, 2.e+00, 9.e+00],
#        [1.e+00, 3.e+00, 1.e-02, 5.e-03]])
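
One edge case is worth checking. With the default `side="left"`, `searchsorted` gives an input that is exactly a power of ten (for example 1.0) the next smaller order of magnitude. The sketch below is one possible self-contained variant that passes `side="right"` instead. The helper name `truncate_to_first_digit` is hypothetical, and the input is assumed to be strictly positive and inside the 1e-10 to 1e10 range of the table.

```python
import numpy as np

# Powers of ten from 1e-10 to 1e10, as in the answer above.
po10 = np.logspace(-10, 10, 21)

def truncate_to_first_digit(a):
    """Keep only the first non-zero digit of each element (hypothetical helper).

    Assumes strictly positive values inside the range covered by `po10`.
    """
    a = np.asarray(a, dtype=float)
    # side="right" maps an exact power of ten to itself instead of to the
    # next smaller power of ten.
    oom = po10[po10.searchsorted(a, side="right") - 1]
    return a // oom * oom

print(truncate_to_first_digit([1.0, 1.232, 9.99, 25.0]))
```

Floor division of floats is still subject to the usual binary rounding effects, so values that sit exactly on a digit boundary should be tested on the data at hand.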

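【Comments】:

Would this be faster or more numerically stable than just computing the logarithm?
@norok2 Depends on the problem size. The logspace only needs to be computed once, and since it has only a handful of entries, searchsorted is certainly faster than taking logs and powers. You could even push this further: instead of using the logspace 10^-10, 10^-9, 10^-8, ..., use all possible results, i.e. xpo10 = 1 x 10^-10, 2 x 10^-10, 3 x 10^-10, ... 9 x 10^-10, 1 x 10^-9, 2 x 10^-9, and so on. That is still a very manageable number of terms, and we no longer have to do any arithmetic at all, just xpo10[xpo10.searchsorted(a)-1]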
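
The lookup-table idea from the last comment could be sketched roughly as follows. The exponent range of -10 to 10 is borrowed from the answer, and the helper name `first_digit_lookup` is hypothetical. Building the entries for negative exponents by division, and passing `side="right"`, are assumptions of this sketch rather than part of the comment. They are meant to keep table entries such as 0.3 identical to the corresponding float literal, so that an input of 0.3 maps to 0.3 and not to 0.2.

```python
import numpy as np

digits = np.arange(1, 10)          # leading digits 1..9
exponents = np.arange(-10, 11)     # assumed exponent range, as in the answer
scale = 10.0 ** np.abs(exponents)  # 1e0 .. 1e10, exactly representable

# d / 10**k for negative exponents, d * 10**k otherwise; one row per exponent.
rows = np.where(exponents[:, None] < 0,
                digits[None, :] / scale[:, None],
                digits[None, :] * scale[:, None])
xpo10 = rows.ravel()               # sorted: 1e-10, 2e-10, ..., 9e10

def first_digit_lookup(a):
    """Largest table entry not exceeding each element (hypothetical helper).

    Assumes strictly positive values inside the range covered by `xpo10`.
    """
    a = np.asarray(a, dtype=float)
    return xpo10[xpo10.searchsorted(a, side="right") - 1]

a = np.array([[1.232, 1.872, 2.732, 0.123],
              [0.0019, 0.025, 1.854, 0.00017],
              [1.457, 0.0021, 2.34, 9.99],
              [1.527, 3.3, 0.012, 0.005]])
print(first_digit_lookup(a))
```

The table has only 189 entries, so the search stays cheap, and no arithmetic is performed on the input values themselves.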

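【Solution 2】:

What you want to do is keep a fixed number of significant figures.

This functionality is not built into NumPy.

To get only 1 significant figure, you can look at the answers by @PaulPanzer or @darcamo (assuming you only have positive values).

If you want something that works with a specified number of significant figures, you can use something like: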

def significant_figures(arr, num=1):
    # Order of magnitude of each element; zeros keep order 0 so that
    # log10(0) is never evaluated.
    order = np.zeros_like(arr)
    mask = arr != 0
    order[mask] = np.floor(np.log10(np.abs(arr[mask])))
    del mask  # free unused memory
    # Number of decimal places that keeps `num` significant figures.
    prec = num - order - 1
    return np.round(arr * 10.0 ** prec) / 10.0 ** prec


print(significant_figures(a, 1))
# [[1.e+00 2.e+00 3.e+00 1.e-01]
#  [2.e-03 2.e-02 2.e+00 2.e-04]
#  [1.e+00 2.e-03 2.e+00 1.e+01]
#  [2.e+00 3.e+00 1.e-02 5.e-03]]

print(significant_figures(a, 2))
# [[1.2e+00 1.9e+00 2.7e+00 1.2e-01]
#  [1.9e-03 2.5e-02 1.9e+00 1.7e-04]
#  [1.5e+00 2.1e-03 2.3e+00 1.0e+01]
#  [1.5e+00 3.3e+00 1.2e-02 5.0e-03]]

EDIT

For truncated output, use np.floor() instead of np.round() in the final return expression.
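
A truncating variant might look like the sketch below. The function name `truncate_significant` is hypothetical. It uses np.trunc rather than np.floor, which is an assumption on my part: the two agree for positive inputs, but np.trunc also moves negative inputs towards zero.

```python
import numpy as np

def truncate_significant(arr, num=1):
    """Truncate, rather than round, to `num` significant figures (hypothetical variant)."""
    arr = np.asarray(arr, dtype=float)
    # Order of magnitude of each element; zeros keep order 0.
    order = np.zeros_like(arr)
    mask = arr != 0
    order[mask] = np.floor(np.log10(np.abs(arr[mask])))
    prec = num - order - 1
    # np.trunc drops the remaining digits towards zero. np.floor gives the
    # same result for positive inputs but pushes negative inputs away from zero.
    return np.trunc(arr * 10.0 ** prec) / 10.0 ** prec

print(truncate_significant([1.872, 9.99, -1.872], 1))
print(truncate_significant([1.872, 9.99, -1.872], 2))
```

As with any approach that scales by powers of ten, inputs that lie exactly on a digit boundary can be affected by binary floating-point representation.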

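【Comments】:

Small nitpick: OP says "rounding", but what they are actually doing is truncating, i.e. always rounding down / towards zero.
@PaulPanzer Thanks for spotting this! Fixed.

【Solution 3】:

First get the power of 10 of each number in the array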

powers = np.floor(np.log10(a))

For your example, this gives us

array([[ 0.,  0.,  0., -1.],
       [-3., -2.,  0., -4.],
       [ 0., -3.,  0.,  0.],
       [ 0.,  0., -2., -3.]])

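Now, if we divide the i-th element of the array by 10**power_i, we essentially move the first non-zero digit of each number into the units position. We can then simply drop the remaining digits with np.floor and multiply the result by 10**power_i to get back to the original scale.

The complete solution is just the code below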

powers = np.floor(np.log10(a))
10**powers * np.floor(a/10**powers)

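What about numbers greater than or equal to 10?

For these, you can simply take np.floor of the original value in the array. We can do that easily with a mask. You can modify the answer as follows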

powers = np.floor(np.log10(a))
result = 10**powers * np.floor(a/10**powers)

mask = a >= 10
result[mask] = np.floor(a[mask])

You can also use the mask to avoid computing powers and logs for the numbers that will be replaced later.
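
One way to apply the mask up front, so that the logarithm and power are only computed for values below 10, is sketched here. The helper name `first_digit_or_floor` is hypothetical, and all inputs are assumed to be strictly positive.

```python
import numpy as np

def first_digit_or_floor(a):
    """First non-zero digit for values < 10, plain floor for values >= 10 (hypothetical helper).

    Assumes strictly positive inputs.
    """
    a = np.asarray(a, dtype=float)
    result = np.empty_like(a)
    big = a >= 10
    # Values >= 10: just drop the fractional part.
    result[big] = np.floor(a[big])
    # Values < 10: compute the power of ten only for these elements.
    small = a[~big]
    scale = 10.0 ** np.floor(np.log10(small))
    result[~big] = scale * np.floor(small / scale)
    return result

print(first_digit_or_floor([12.32, 15.78, 121.34, 1.872, 9.99, 0.025]))
```

The result is the same as computing everything and overwriting afterwards. The only difference is that no work is spent on entries that would be replaced anyway.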

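【Comments】:

Thanks! Works very well and is simple. Is there a way to generalize this to numbers greater than 10, say 12.32, 15.78, 121.34, to get 12, 15, 121?
Note that a must not contain any non-positive values here.
I have edited the answer with a section that makes it work for numbers >= 10.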