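【Question title】: How can a bin width be made consistent between multiple matplotlib histograms?
【Posted】: 2016-04-26 09:03:15
【Question】:

I have a small function that takes two lists of numbers and compares them using an overlaid histogram of both and a ratio plot. The bin width of the ratio plot does not match the bin width of the overlaid histograms. How can I make the ratio plot's bin width the same as that of the overlaid histograms?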

import numpy
import matplotlib.pyplot
import datavision # sudo pip install datavision
import shijian    # sudo pip install shijian

def main():

    a = numpy.random.normal(2, 2, size = 120)
    b = numpy.random.normal(2, 2, size = 120)

    save_histogram_comparison_matplotlib(
        values_1      = a,
        values_2      = b,
        label_1       = "a",
        label_2       = "b",
        normalize     = True,
        label_ratio_x = "frequency",
        label_y       = "",
        title         = "comparison of a and b",
        filename      = "test.png"
    )

def save_histogram_comparison_matplotlib(
    values_1       = None,
    values_2       = None,
    filename       = None,
    number_of_bins = None,
    normalize      = True,
    label_x        = "",
    label_y        = None,
    label_ratio_x  = "frequency",
    label_ratio_y  = "ratio",
    title          = None,
    label_1        = "1",
    label_2        = "2",
    overwrite      = True,
    LaTeX          = False
    ):

    matplotlib.pyplot.ioff()
    if LaTeX is True:
        matplotlib.pyplot.rc("text", usetex = True)
        matplotlib.pyplot.rc("font", family = "serif")
    if number_of_bins is None:
        number_of_bins_1 = datavision.propose_number_of_bins(values_1)
        number_of_bins_2 = datavision.propose_number_of_bins(values_2)
        number_of_bins   = int((number_of_bins_1 + number_of_bins_2) / 2)
    if filename is None:
        filename = shijian.propose_filename(
            filename  = title.replace(" ", "_") + ".png",
            overwrite = overwrite
        )

    values = []
    values.append(values_1)
    values.append(values_2)
    figure, (axis_1, axis_2) = matplotlib.pyplot.subplots(nrows = 2)
    ns, bins, patches = axis_1.hist(
        values,
        density  = normalize,  # `normed` was removed in Matplotlib 3.1
        histtype = "stepfilled",
        bins     = number_of_bins,
        alpha    = 0.5,
        label    = [label_1, label_2]
    )
    axis_1.legend()
    axis_2.bar(
        bins[:-1],
        ns[0] / ns[1],
        alpha = 1,
    )
    axis_1.set_xlabel(label_x)
    axis_1.set_ylabel(label_y)
    axis_2.set_xlabel(label_ratio_x)
    axis_2.set_ylabel(label_ratio_y)
    matplotlib.pyplot.title(title)
    matplotlib.pyplot.savefig(filename)
    matplotlib.pyplot.close()

if __name__ == "__main__":
    main()

EDIT: temporary scratchpad, because code does not format sensibly in comments

import numpy
import matplotlib.pyplot
import datavision
import shijian

def main():

    a = numpy.random.normal(2, 2, size = 120)
    b = numpy.random.normal(2, 2, size = 120)

    save_histogram_comparison_matplotlib(
        values_1      = a,
        values_2      = b,
        label_1       = "a",
        label_2       = "b",
        normalize     = True,
        label_ratio_x = "frequency",
        label_y       = "",
        title         = "comparison of a and b",
        filename      = "test.png"
    )

def save_histogram_comparison_matplotlib(
    values_1       = None,
    values_2       = None,
    filename       = None,
    number_of_bins = None,
    normalize      = True,
    label_x        = "",
    label_y        = None,
    label_ratio_x  = "frequency",
    label_ratio_y  = "ratio",
    title          = None,
    label_1        = "1",
    label_2        = "2",
    overwrite      = True,
    LaTeX          = False
    ):

    matplotlib.pyplot.ioff()
    if LaTeX is True:
        matplotlib.pyplot.rc("text", usetex = True)
        matplotlib.pyplot.rc("font", family = "serif")
    if number_of_bins is None:
        number_of_bins_1 = datavision.propose_number_of_bins(values_1)
        number_of_bins_2 = datavision.propose_number_of_bins(values_2)
        number_of_bins   = int((number_of_bins_1 + number_of_bins_2) / 2)
    if filename is None:
        filename = shijian.propose_filename(
            filename  = title.replace(" ", "_") + ".png",
            overwrite = overwrite
        )

    bar_width = 1
    values = []
    values.append(values_1)
    values.append(values_2)
    figure, (axis_1, axis_2) = matplotlib.pyplot.subplots(nrows = 2)
    ns, bins, patches = axis_1.hist(
        values,
        density  = normalize,  # `normed` was removed in Matplotlib 3.1
        histtype = "stepfilled",
        bins     = number_of_bins,
        alpha    = 0.5,
        label    = [label_1, label_2],
        rwidth   = bar_width
    )
    axis_1.legend()
    axis_2.bar(
        bins[:-1],
        ns[0] / ns[1],
        alpha = 1,
        width = bar_width
    )
    axis_1.set_xlabel(label_x)
    axis_1.set_ylabel(label_y)
    axis_2.set_xlabel(label_ratio_x)
    axis_2.set_ylabel(label_ratio_y)
    matplotlib.pyplot.title(title)
    matplotlib.pyplot.savefig(filename)
    matplotlib.pyplot.close()

if __name__ == "__main__":
    main()

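【Comments】:

  • Have you tried passing rwidth=1 to hist and width=1 to bar? That assumes each bin has a width of 1; the r in rwidth stands for "relative".
  • @AndrasDeak Thanks for your comment. I'm not familiar with these options. Given that I have already drawn the histogram overlay, how do I know what value to give bar for width?
  • Since you seem to want the bars drawn contiguously, you probably need to set rwidth=1 in hist and set width=bins[1]-bins[0] (or something similar) in bar accordingly. You may also need to adjust the alignment of the bars in one of the plots.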
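
The last comment's suggestion can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's full function: the seeded sample data and the fixed bin count of 10 are made up for the sketch, and density is used in place of normed because newer Matplotlib releases no longer accept normed. The bar widths are read off the bin edges that hist returns, and align="edge" places the left side of each ratio bar on the left edge of its bin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy

# Assumed sample data, seeded only so the sketch is repeatable.
rng = numpy.random.default_rng(0)
a = rng.normal(2, 2, size=120)
b = rng.normal(2, 2, size=120)

figure, (axis_1, axis_2) = plt.subplots(nrows=2, sharex=True)

# hist returns the per-dataset bin heights and the shared bin edges.
ns, bins, patches = axis_1.hist(
    [a, b],
    bins=10,
    density=True,
    histtype="stepfilled",
    alpha=0.5,
    label=["a", "b"],
)
axis_1.legend()
n_a, n_b = numpy.asarray(ns, dtype=float)

# One width per bin, taken from the edges themselves.
widths = numpy.diff(bins)

# Bins that are empty in b would divide by zero; leave those ratios as NaN.
ratio = numpy.full_like(n_a, numpy.nan)
numpy.divide(n_a, n_b, out=ratio, where=n_b > 0)

# align="edge" anchors each bar at the left edge of its bin.
bars = axis_2.bar(bins[:-1], ratio, width=widths, align="edge")
axis_2.set_ylabel("ratio")

plt.close(figure)
```

Saving works as in the question, for example figure.savefig("test.png") before the figure is closed. Using numpy.diff(bins) rather than bins[1]-bins[0] also covers the case where the bin edges are not evenly spaced.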

Tags: matplotlib histogram binning


【Answer 1】:

You need to use the rwidth parameter in the axis_1.hist(...) call.

You can adjust rwidth and bins to match your axis_2.bar(...) call (the default width in bar is 0.8).

For example:

matplotlib.pyplot.hist(a,bins=6,rwidth=0.8)
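
A related sketch, which goes beyond what the answer itself states: when several histograms are drawn with separate hist calls, one way to keep their bins identical is to compute a single array of bin edges and pass it to every call. The sample data and the choice of 6 bins are assumptions for illustration. Note that, according to the Matplotlib documentation, rwidth is ignored when histtype is "step" or "stepfilled" (the type used in the question), so this sketch uses the default bar type.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy

# Assumed sample data with deliberately different spreads.
rng = numpy.random.default_rng(1)
a = rng.normal(2, 2, size=120)
b = rng.normal(3, 1, size=120)

# Compute one set of edges over both datasets together.
edges = numpy.histogram_bin_edges(numpy.concatenate([a, b]), bins=6)

figure, axis = plt.subplots()

# Passing the same edge array to each call gives both histograms the same bins.
n_a, bins_a, _ = axis.hist(a, bins=edges, rwidth=0.8, alpha=0.5, label="a")
n_b, bins_b, _ = axis.hist(b, bins=edges, rwidth=0.8, alpha=0.5, label="b")
axis.legend()

plt.close(figure)
```

Because the edges span the combined range of both datasets, every value falls into some bin, and the edge array can be reused later for a ratio plot.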

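【Comments】:

  • Hello. Thanks for your help with this. I'm not sure I follow. I have tried this (the code shown in the "scratchpad" section above, feel free to edit it) and the bin widths are not equal. Given that I have already drawn the histogram overlay, how do I know what value to give bar for width?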