Weird behavior of numpy.histogram / random numbers in numpy? Answers

【Question title】: weird behavior of numpy.histogram / random numbers in numpy?
【Posted】: 2018-05-27 21:48:49
【Question】:

I stumbled upon some peculiar behavior of random numbers in Python, specifically when using the module numpy.random.

Consider the following expression:

import numpy as np

n = 50
N = 1000
np.histogram(np.sum(np.random.randint(0, 2, size=(n, N)), axis=0), bins=n+1)[0]

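In the limit of large N I would expect a binomial distribution (for the interested reader, this simulates the Ehrenfest model), and for large n a normal distribution. However, a typical output looks like this: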

array([
1, 0, 0, 1, 0, 2, 0, 1, 0, 15, 0,
12, 0, 18, 0, 39, 0, 64, 0, 62, 0, 109,
0, 97, 0, 107, 0, 114, 0, 102, 0, 92, 0,
55, 0, 46, 0, 35, 0, 10, 0, 9, 0, 4,
0, 0, 0, 3, 0, 1, 1
])

Given the reasoning above, I cannot explain the zeros that appear in the histogram. Am I missing something obvious here?

【Comments】:

Tags: python numpy random statistics

【Solution 1】:

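You are using histogram wrong. The bins are not where you think they are. They do not run from 0 to 50; they run from the minimum input value to the maximum input value. Since the observed sums span fewer than 51 integers, each of the n+1 bins is narrower than 1, and the zeros correspond to bins that lie entirely between two integers.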
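To make the bin placement concrete, here is a minimal sketch that inspects the edges np.histogram actually picks. The seed, the `rng` generator (np.random.default_rng instead of the question's np.random.randint), and the variable names are illustrative assumptions, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n, N = 50, 1000

# Same experiment as in the question: N sums of n fair 0/1 draws.
sums = rng.integers(0, 2, size=(n, N)).sum(axis=0)

# With an integer `bins` and no `range`, the edges span [sums.min(), sums.max()].
counts, edges = np.histogram(sums, bins=n + 1)
bin_width = edges[1] - edges[0]          # (max - min) / (n + 1), which is < 1
empty_bins = int((counts == 0).sum())    # bins that contain no integer

print("first/last edge:", edges[0], edges[-1])
print("bin width:", bin_width, "empty bins:", empty_bins)

# One way to keep np.histogram: pass explicit edges, one unit-wide bin
# centered on each integer 0..n.
int_edges = np.arange(n + 2) - 0.5
int_counts, _ = np.histogram(sums, bins=int_edges)
```

With explicit half-integer edges every integer from 0 to n gets its own bin, so the spurious alternating zeros should disappear.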

Try numpy.bincount:

In [31]: n = 50

In [32]: N = 5000

In [33]: np.bincount(np.sum(np.random.randint(0, 2, size=(n, N)), axis=0))
Out[33]: 
array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   7,  13,  22,  46,  75, 126, 220, 305, 367, 461, 550, 578,
       517, 471, 438, 314, 189, 146,  76,  50,  17,   9,   2,   1])

【Comments】:

For completeness: passing minlength=n+1 to np.bincount, so that the result covers the whole range 0..n, gives the desired behavior.
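A short sketch of that suggestion, also comparing the counts against the binomial expectation N * C(n, k) / 2^n mentioned in the question. The seed, the `rng` generator, and the names `observed` and `expected` are assumptions for illustration; math.comb needs Python 3.8 or later.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n, N = 50, 5000

sums = rng.integers(0, 2, size=(n, N)).sum(axis=0)

# minlength pads the result so it always has one entry per value 0..n.
observed = np.bincount(sums, minlength=n + 1)

# Expected counts for Binomial(n, 1/2): N * C(n, k) / 2**n.
expected = np.array([N * math.comb(n, k) / 2**n for k in range(n + 1)])

# Print the central part of the distribution side by side.
for k in range(20, 31):
    print(k, observed[k], round(expected[k], 1))
```

The observed counts should scatter around the expected ones, with the peak near k = n/2.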