【Question title】: Generate probability distribution or smoothing plot from points containing probabilities
【Posted】: 2020-03-07 23:16:16
【Question description】:

My points consist of values on the x-axis and probabilities on the y-axis, for example:

p1 =
[[0.0, 0.0001430560406790707],
[10.0, 6.2797052001508247e-13],
[15.0, 4.8114669550502021e-06],
[20.0, 0.0007443231772534647],
[25.0, 0.00061070912573869406],
[30.0, 0.48116582167944905],
[35.0, 0.24698643991977953],
[40.0, 0.016407283121225951],
[45.0, 0.2557158314329116],
[50.0, 1.1252231121357235e-05],
[55.0, 0.064666668633158647],
[60.0, 1.7631447655837744e-17],
[65.0, 1.1294722466816786e-14],
[70.0, 2.9419020411134367e-16],
[75.0, 3.0887653014525822e-17],
[80.0, 4.4973693062706866e-17],
[85.0, 9.0975358174005147e-15],
[90.0, 1.0758266454985257e-10],
[95.0, 7.2923752473657924e-08],
[100.0, 1.8065366882584036e-08]]

p2 =
[[0.0, 4.1652247577331996e-06],
[10.0, 1.2212829713673957e-06],
[15.0, 6.5906857192417344e-08],
[20.0, 0.00016745946587138236],
[25.0, 0.0054431111796765554],
[30.0, 0.0067575214586160616],
[35.0, 0.00011856110316632124],
[40.0, 0.00032181662132509944],
[45.0, 0.001397981055516994],
[50.0, 0.0027058954834684062],
[55.0, 2.553142406703067e-06],
[60.0, 1.1514033594755017e-08],
[65.0, 0.21961568282994792],
[70.0, 2.4658349829099807e-08],
[75.0, 0.0022850986575076743],
[80.0, 3.5603047823624507e-06],
[85.0, 0.99406392082894734],
[90.0, 0.24399923235645221],
[95.0, 0.0013470125217945798],
[100.0, 0.042582366972883985]] 

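Now I want to generate a probability distribution from these points, where the x-axis values are (0, 10, 15, 20, ..., 100) and the y-axis values are the probabilities (0.00014, ...).

When I use the plt.plot function, I get: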

plt.plot([item[0] for item in p1],[item[1] for item in p1])

For p2:

plt.plot([item[0] for item in p2],[item[1] for item in p2])

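I would like a smoother visualization, such as a probability distribution:

If a probability distribution is not possible, a smoothing spline would do: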

【Question discussion】:

    Tags: python matplotlib statistics probability probability-distribution


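    【Solution 1】:

    SciPy's gaussian_kde is commonly used to approximate a probability distribution with a smooth curve. It places a Gaussian kernel at each input point and sums the kernels. Usually individual measurements are used as input, but the weights parameter makes it possible to work with binned data. The resulting function is normalized so that its integral equals one.

    This approach assumes that the values in p1 and p2 are averages over a segment around each x value, similar to a histogram, i.e. a step function in which each x value marks the end of a step.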
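To make the kernel sum described above concrete, here is a minimal sketch that rebuilds a weighted KDE by hand and compares it with gaussian_kde. The bin centers and probabilities are made-up illustrative values, not the data from the question, and `weighted_kde_by_hand` is a hypothetical helper name. The kernel width is read back from the fitted object's `covariance` attribute rather than derived independently.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical bin centers and bin probabilities (illustrative values only).
centers = np.array([10.0, 30.0, 35.0, 45.0, 55.0])
probs = np.array([0.10, 0.50, 0.25, 0.10, 0.05])

kde = gaussian_kde(centers, bw_method=0.25, weights=probs)

# Standard deviation of each Gaussian kernel, as chosen by gaussian_kde.
sigma = np.sqrt(kde.covariance[0, 0])


def weighted_kde_by_hand(x):
    """Sum one Gaussian per bin center, scaled by its normalized weight."""
    w = probs / probs.sum()
    z = (x[:, None] - centers[None, :]) / sigma
    kernels = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2 * np.pi))
    return kernels @ w


x = np.linspace(0, 100, 201)
manual = weighted_kde_by_hand(x)

# Total area under the estimated density over the whole real line.
total_mass = kde.integrate_box_1d(-np.inf, np.inf)
```

Under these assumptions `manual` and `kde(x)` should agree to within floating-point error, and `total_mass` should be one up to rounding. This is why the KDE curve ends up on a different vertical scale than the raw probabilities.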

    from matplotlib import pyplot as plt
    import numpy as np
    from scipy.stats import gaussian_kde
    
    p1 = np.array([[0.0, 0.0001430560406790707],
                   [10.0, 6.2797052001508247e-13],
                   [15.0, 4.8114669550502021e-06],
                   [20.0, 0.0007443231772534647],
                   [25.0, 0.00061070912573869406],
                   [30.0, 0.48116582167944905],
                   [35.0, 0.24698643991977953],
                   [40.0, 0.016407283121225951],
                   [45.0, 0.2557158314329116],
                   [50.0, 1.1252231121357235e-05],
                   [55.0, 0.064666668633158647],
                   [60.0, 1.7631447655837744e-17],
                   [65.0, 1.1294722466816786e-14],
                   [70.0, 2.9419020411134367e-16],
                   [75.0, 3.0887653014525822e-17],
                   [80.0, 4.4973693062706866e-17],
                   [85.0, 9.0975358174005147e-15],
                   [90.0, 1.0758266454985257e-10],
                   [95.0, 7.2923752473657924e-08],
                   [100.0, 1.8065366882584036e-08]])
    p2 = np.array([[0.0, 4.1652247577331996e-06],
                   [10.0, 1.2212829713673957e-06],
                   [15.0, 6.5906857192417344e-08],
                   [20.0, 0.00016745946587138236],
                   [25.0, 0.0054431111796765554],
                   [30.0, 0.0067575214586160616],
                   [35.0, 0.00011856110316632124],
                   [40.0, 0.00032181662132509944],
                   [45.0, 0.001397981055516994],
                   [50.0, 0.0027058954834684062],
                   [55.0, 2.553142406703067e-06],
                   [60.0, 1.1514033594755017e-08],
                   [65.0, 0.21961568282994792],
                   [70.0, 2.4658349829099807e-08],
                   [75.0, 0.0022850986575076743],
                   [80.0, 3.5603047823624507e-06],
                   [85.0, 0.99406392082894734],
                   [90.0, 0.24399923235645221],
                   [95.0, 0.0013470125217945798],
                   [100.0, 0.042582366972883985]])
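    x = np.linspace(0, 100, 1000)
    fig, axes = plt.subplots(ncols=2)
    for ax, p in zip(axes, [p1, p2]):
        p[0, 0] = 5.0  # let each x-value be the end of a segment (modifies p in place)
        # dotted step plot of the raw probabilities, one step per segment
        ax.step(p[:,0], p[:,1], color='dodgerblue', lw=1, ls=':', where='pre')
        ax2 = ax.twinx()  # second y-axis, because the density has a different scale
        # shift by half a segment width (2.5) so each kernel sits at a segment center
        kde = gaussian_kde(p[:,0]-2.5, bw_method=.25, weights=p[:,1])
        ax2.plot(x, kde(x), color='crimson')
    plt.show()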
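
The question also mentions a spline as a fallback. The sketch below is not part of the answer above and is not a true smoothing spline. It swaps in SciPy's PchipInterpolator, a shape-preserving cubic interpolant that passes through every point and does not overshoot, so the curve should not dip below zero for non-negative data. The sample values are made up for illustration.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical (x, probability) pairs, illustrative values only.
xs = np.array([0.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0])
ys = np.array([0.0, 0.001, 0.02, 0.48, 0.25, 0.02, 0.2, 0.0])

# PCHIP needs strictly increasing x values; it interpolates monotonically
# between neighbouring points, so it stays within the range of the data.
pchip = PchipInterpolator(xs, ys)

x_fine = np.linspace(xs[0], xs[-1], 400)
curve = pchip(x_fine)
```

The result could be drawn with something like `plt.plot(x_fine, curve)`. Unlike the KDE, this curve is not normalized to integrate to one; it only connects the given probabilities smoothly.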
    

    【Discussion】:
