Assertion error when trying to plot error bars on a histogram

【Question title】: Getting an assertion error when trying to plot error bars on a histogram
【Posted】: 2021-08-23 07:48:15
【Question】:

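I am trying to plot a histogram with 30 bins from randomly generated data. I should add that I am still quite new to programming. The x values (vals) are generated as follows: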

import STOM_higgs_tools
vals = STOM_higgs_tools.generate_data() # A python list.
#print(vals)
# Each list entry represents the rest mass reconstructed from a collision

The histogram is then plotted as follows:

import matplotlib.pyplot as plt # Making plots.
import numpy as np # Random number generation.
# Make a histogram.
bin_heights, bin_edges, patches = plt.hist(vals, range = [104, 155], bins = 30)
# Add the error bars
bin_height_sqrt = np.sqrt(bin_heights)
half_bin_width = 0.5*(bin_edges[1] - bin_edges[0])
plt.errorbar(half_bin_width, bin_heights, yerr=bin_height_sqrt, fmt='none')
plt.ylabel('Number of entries')
plt.xlabel('$m_{γγ}$ (GeV)')
plt.show()
# bin_heights and bin_edges are numpy arrays.
# patches are the matplotlib bar objects, which we won’t need.

The histogram itself plots fine if I remove the code that tries to draw the error bars.

[Figure: the histogram without error bars]

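The range is set to [104, 155] so that only data within that range is plotted in the histogram. The size of each error bar should be the square root of the bin height (bin_heights). However, when I try to add the error bars, I get an assertion error: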

Traceback (most recent call last):

  File "C:\Users\Sidharth\Documents\Computing Labs\Computing Lab Session 3\statsgroupproject.py", line 102, in <module>
    plt.errorbar(half_bin_width, bin_heights, yerr=bin_height_sqrt, fmt='none')

  File "C:\Users\Sidharth\anaconda3\lib\site-packages\matplotlib\pyplot.py", line 2524, in errorbar
    return gca().errorbar(

  File "C:\Users\Sidharth\anaconda3\lib\site-packages\matplotlib\__init__.py", line 1565, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)

  File "C:\Users\Sidharth\anaconda3\lib\site-packages\matplotlib\axes\_axes.py", line 3382, in errorbar
    xo, _ = xywhere(x, lower, noylims & everymask)

  File "C:\Users\Sidharth\anaconda3\lib\site-packages\matplotlib\axes\_axes.py", line 3285, in xywhere
    assert len(xs) == len(ys)

AssertionError

I don't know how to fix this. The module imported at the start is custom-made and is defined as follows (in case it is relevant):

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)

N_b = 10e5 # Number of background events, used in generation and in fit.
b_tau = 30. # Spoiler.

def generate_data(n_signals = 400):
    ''' 
    Generate a set of values for signal and background. Input argument sets 
    the number of signal events, and can be varied (default to higgs-like at 
    announcement). 
    
    The background amplitude is fixed to 9e5 events, and is modelled as an exponential, 
    hard coded width. The signal is modelled as a gaussian on top (again, hard 
    coded width and mu).
    '''
    vals = []
    vals += generate_signal( n_signals, 125., 1.5)
    vals += generate_background( N_b, b_tau)
    return vals


def generate_signal(N, mu, sig):
    ''' 
    Generate N values according to a gaussian distribution.
    '''
    return np.random.normal(loc = mu, scale = sig, size = N).tolist()


def generate_background(N, tau):
    ''' 
    Generate N values according to an exp distribution.
    '''
    return np.random.exponential(scale = tau, size = int(N)).tolist()


def get_B_chi(vals, mass_range, nbins, A, lamb):
    ''' 
    Calculates the chi-square value of the no-signal hypothesis (i.e background
    only) for the passed values. Need an expectation - use the analytic form,
    using the hard coded scale of the exp. That depends on the binning, so pass
    in as argument. The mass range must also be set - otherwise, it's ignored.
    '''
    bin_heights, bin_edges = np.histogram(vals, range = mass_range, bins = nbins)
    half_bin_width = 0.5*(bin_edges[1] - bin_edges[0])
    ys_expected = get_B_expectation(bin_edges + half_bin_width, A, lamb)
    chi = 0

    # Loop over bins - all of them for now. 
    for i in range( len(bin_heights) ):
        chi_nominator = (bin_heights[i] - ys_expected[i])**2
        chi_denominator = ys_expected[i]
        chi += chi_nominator / chi_denominator
    
    return chi/float(nbins-2) # B has 2 parameters.


def get_B_expectation(xs, A, lamb):
    ''' 
    Return a set of expectation values for the background distribution for the 
    passed in x values. 
    '''
    return [A*np.exp(-x/lamb) for x in xs]


def signal_gaus(x, mu, sig, signal_amp):
    return signal_amp/(np.sqrt(2.*np.pi)*sig)*np.exp(-np.power((x - mu)/sig, 2.)/2)


def get_SB_expectation(xs, A, lamb, mu, sig, signal_amp):
    ys = []
    for x in xs:
        ys.append(A*np.exp(-x/lamb) + signal_gaus(x, mu, sig, signal_amp))
    return ys

Edit 1

Edit 2

# Make a histogram.
bin_no=30
bin_heights, bin_edges, patches = plt.hist(vals, range = [104, 155], bins = bin_no)
# Add the error bars
half_bin_width = []
for i in range(30):
    half_bin_width.append((bin_edges[i]+bin_edges[i+1])*0.5)
bin_height_sqrt = np.sqrt(bin_heights)
plt.errorbar(half_bin_width, bin_heights, yerr=bin_height_sqrt, fmt='none')
plt.ylabel('Number of entries')
plt.xlabel('$m_{γγ}$ (GeV)')
plt.show()

I was able to fix it with the code above, which uses a for loop to compute the bin centers.
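
For comparison, the loop in Edit 2 and the slice expression `0.5*(bin_edges[1:] + bin_edges[:-1])` should give the same bin centers. The sketch below checks this on stand-in edges; it assumes the edges returned by plt.hist for range=[104, 155] and bins=30 are evenly spaced, and builds them with np.linspace rather than from the real data.

```python
import numpy as np

# Stand-in for the bin_edges returned by plt.hist (assumption: evenly spaced edges).
bin_no = 30
bin_edges = np.linspace(104, 155, bin_no + 1)  # 31 edges delimit 30 bins

# Loop form, as in Edit 2.
centers_loop = []
for i in range(bin_no):
    centers_loop.append((bin_edges[i] + bin_edges[i + 1]) * 0.5)

# Vectorized form: average each left edge with its right-hand neighbour.
centers_vec = 0.5 * (bin_edges[:-1] + bin_edges[1:])

print(len(centers_loop), len(centers_vec))      # 30 30
print(np.allclose(centers_loop, centers_vec))   # True
```

Either form yields one x value per bin, which is what plt.errorbar needs to match the 30 bin heights.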

【Comments】:

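What version of matplotlib is this? User-facing code should definitely not assert like this; those lines are proper checks with ValueErrors now. I see the same code in the latest version of matplotlib that I have installed.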
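
To answer the version question, and to see which exception a particular installation raises when x and y differ in length, something like the following could be run. It is only a sketch: the Agg backend is chosen so that no window opens, and the y values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so nothing is displayed
import matplotlib.pyplot as plt
import numpy as np

print(matplotlib.__version__)

fig, ax = plt.subplots()
y = np.arange(30.0)  # made-up bin heights, one per bin

raised = None
try:
    # A single x value against 30 y values, as in the question.
    ax.errorbar(0.85, y, yerr=np.sqrt(y), fmt="none")
except (AssertionError, ValueError) as exc:
    raised = type(exc).__name__

print(raised)
plt.close(fig)
```

Older releases fail on the internal assert shown in the traceback, while newer ones report the size mismatch as a ValueError.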

Tags: python numpy matplotlib histogram errorbar

【Solution 1】:

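You only need to change your half_bin_width slightly: as written it is a single number, but errorbar needs one x value per bin (the bin centers). Try this: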

import matplotlib.pyplot as plt
import numpy as np

vals = []
vals += np.random.normal(loc=125, scale=1.5, size=400).tolist()
vals += np.random.exponential(scale=30, size=100000).tolist()
    
bin_heights, bin_edges, patches = plt.hist(vals, range=[104, 155], bins=30)
half_bin_width = 0.5*(bin_edges[1:] + bin_edges[:-1]) #making x and y the same size
bin_height_sqrt = np.sqrt(bin_heights)
plt.errorbar(half_bin_width, bin_heights, yerr=bin_height_sqrt, fmt="none")
plt.ylabel('Number of entries')
plt.xlabel('$m_{γγ}$ (GeV)')
plt.show()
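
The size mismatch behind the assertion can be checked without plotting. The following is a minimal sketch that uses np.histogram in place of plt.hist and a made-up exponential sample (the seed and sample size are arbitrary choices, not the question's data):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
vals = rng.exponential(scale=30, size=100_000)

bin_heights, bin_edges = np.histogram(vals, range=(104, 155), bins=30)

# x argument from the question: one number (half the width of a bin).
half_bin_width = 0.5 * (bin_edges[1] - bin_edges[0])

# x argument from this answer: one center per bin.
bin_centers = 0.5 * (bin_edges[1:] + bin_edges[:-1])

print(np.size(half_bin_width), bin_heights.size)  # 1 30  -> lengths differ
print(bin_centers.size, bin_heights.size)         # 30 30 -> lengths match
```

With 30 bins there are 31 edges, so averaging neighbouring edges gives exactly 30 centers, one for each entry of bin_heights.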

【Comments】:

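This sort of works, but the error "bars" are all bunched up outside my actual data. See the edit in the main question.
That must be a different problem, then? I adapted your functions to generate values like the ones you probably have, and my plot looks fine. Take a look at my edited answer.
I was able to fix it in the end. I changed the definition of half_bin_width slightly by adding a for loop. If you are interested in what I did, see Edit 2 in the main question. Thanks for your help!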