Error computing KL divergence in Scipy

【Question title】: Error computing KL divergence in Scipy
【Posted】: 2017-10-03 11:52:36
【Question】:

I am trying to compute the KL divergence using scipy's entropy function.

My p is:

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

and q is:

array([[ 0.05242718,  0.04436347,  0.04130855,  0.04878344,  0.04310538,
         0.02856853,  0.03303122,  0.02517992,  0.08525434,  0.03450324,
         0.14580068,  0.1286993 ,  0.28897473],
       [ 0.65421444,  0.11592199,  0.0642645 ,  0.02989768,  0.01385762,
         0.01756484,  0.01024294,  0.00891479,  0.01140301,  0.00718939,
         0.00938009,  0.01070139,  0.04644726],
       [ 0.65984136,  0.13251236,  0.06345234,  0.02891162,  0.02429709,
         0.02025307,  0.01073064,  0.01170066,  0.00678652,  0.00703361,
         0.00560414,  0.00651137,  0.02236522],
       [ 0.32315928,  0.23900077,  0.05460232,  0.03953635,  0.02901102,
         0.01294443,  0.02372061,  0.02092882,  0.01188251,  0.01377188,
         0.02976672,  0.05854314,  0.14313218],
       [ 0.7717858 ,  0.09692616,  0.03415596,  0.01713088,  0.01108141,
         0.0128005 ,  0.00847301,  0.01049734,  0.0052889 ,  0.00514799,
         0.00442508,  0.00485477,  0.01743218]], dtype=float32)

When I run this:

entropy(p[0],q[0])

I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-201-563ea7d4decf> in <module>()
      4 print('p0:',p[0])
      5 print('q0:',q[0])
----> 6 entropy(p[0],q[0])

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
   1570     y = np.zeros((len(x)+2,), x.dtype)
   1571     y[1:-1] = x
-> 1572     dif = np.diff(y)
   1573     up = (dif == 1).nonzero()[0]
   1574     dn = (dif == -1).nonzero()[0]

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
    781         if (np.diff(bins) < 0).any():
    782             raise ValueError(
--> 783                 'bins must increase monotonically.')
    784 
    785         # Initialize empty histogram

ValueError: bins must increase monotonically.

Why does this happen?

【Comments】:

Please don't post errors/code as screenshots; copy them in as text instead.
@kazemakase Done
@VeilEclipse thank you :)

Tags: numpy matplotlib machine-learning scipy signal-processing

【Answer 1】:

This works for the example arrays:

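import scipy.stats  # import the submodule explicitly; a bare "import scipy" may not load it on older SciPy
scipy.stats.entropy(p[0], q[0])  # KL divergence of p[0] from q[0]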

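Looking at the stack trace in the error message, it is clear that you did not call scipy's entropy function but matplotlib's entropy, which works differently. The matplotlib version has the signature entropy(y, bins), so q[0] is interpreted as histogram bin edges, which is why numpy complains that the bins must increase monotonically. Here is the relevant part: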

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
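
One way to confirm which entropy is in use, and to see what the SciPy call computes, is sketched below. It uses only the first rows of p and q from the question; the names p0, q0, kl, and manual are illustrative and not from the original post. With two arguments, scipy.stats.entropy normalizes both inputs and returns sum(p * log(p / q)), so for a one-hot p the result reduces to minus the log of the matching q entry.

```python
import numpy as np
from scipy import stats

# First rows of p and q from the question.
p0 = np.zeros(13)
p0[12] = 1.0
q0 = np.array([0.05242718, 0.04436347, 0.04130855, 0.04878344, 0.04310538,
               0.02856853, 0.03303122, 0.02517992, 0.08525434, 0.03450324,
               0.14580068, 0.1286993, 0.28897473], dtype=np.float32)

# Calling through the module name rules out any shadowed bare `entropy`.
kl = stats.entropy(p0, q0)

# p0 is one-hot, so the KL divergence reduces to -log(q0[12] / q0.sum()).
manual = -np.log(q0[12] / q0.sum())

print(stats.entropy.__module__)  # module path should start with "scipy"
print(kl, manual)                # the two values should agree closely
```

If a bare entropy name is already bound in the session, printing entropy.__module__ in the same way shows which library it came from.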

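【Comments】:

Thanks. I imported matplotlib and also imported scipy's entropy explicitly. Maybe they conflict.
@VeilEclipse Yes, that can happen. The result may depend on the import order and similar details. For this reason it is usually a bad idea to import * into the global namespace, even if it looks convenient at first.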
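
The shadowing effect described in this comment can be reproduced with the standard library alone. The sketch below is an illustration, not the asker's actual imports: math and cmath stand in for the two libraries because both export a function named sqrt, and the later wildcard import silently replaces the earlier binding.

```python
# Illustration only: math and cmath stand in for the two libraries,
# since both export a function named `sqrt`.
from math import *   # binds sqrt -> math.sqrt
from cmath import *  # silently rebinds sqrt -> cmath.sqrt

print(sqrt.__module__)  # cmath: the later import wins
print(sqrt(4))          # (2+0j), a complex number rather than 2.0

# Importing the modules by name keeps both functions reachable.
import math
import cmath

print(math.sqrt(4))   # 2.0
print(cmath.sqrt(4))  # (2+0j)
```

Swapping the order of the two wildcard imports changes which sqrt is bound, which matches the remark that the result can depend on import order.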