【Question title】:Fitting a Gaussian to a 1D masked data array
【Posted】:2018-09-11 03:49:30
【Question description】:

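I have a masked 1D data array. It contained nan values, which I masked and which now print as --. I want to fit a Gaussian to this array and create a histogram using the fitted mean and standard deviation. I tried scipy.stats.norm.fit, but it did not work (the mean and std both came back as 'nan'). I then looked at scipy.stats.mstats, but it does not appear to have a fit function.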

Is there a module that can fit a Gaussian to a masked array and output the mean and standard deviation?

Edit: here is my code

def createRmsMatrix( self ):

    '''
    Creates an array of RMS values for each profile in one file.
    '''

    # Initialize RMS table of zeroes
    rmsMatrix = np.zeros( ( self.nSub, self.nChan ), dtype = float )

    # Loop over the time and frequency indices
    for time in np.arange( self.nSub ):
        for frequency in np.arange( self.nChan ):

            # Create a mask along the bin space
            mask = utils.binMask( self.data[time][frequency], 0.55 )

            #print(mask)

            rmsMatrix[time][frequency] = mu.rootMeanSquare( self.data[time][frequency][mask == 0] )

    # Mask the nan values in the array
    rmsMatrix = np.ma.array( rmsMatrix, mask = np.isnan( rmsMatrix ) )

    print( "Root Mean Square matrix created..." )

    return rmsMatrix

The part of my main function that calls it is:

    # Return the array of RMS values for each profile
    self.rmsArray = self.createRmsMatrix()

    # Reshape RMS array to be linear and store in a new RMS array
    self.linearRmsArray = np.reshape( self.rmsArray, ( self.nChan * self.nSub ) )

    # Best fit of data using a Gaussian fit
    mu, sigma = norm.fit( self.linearRmsArray )

    # Creates the histogram
    n, bins, patches = self.histogramPlot( self.linearRmsArray, mu, sigma, 'Root Mean Squared', 'Frequency Density', True )

histogramPlot is just a convenience wrapper I use to organize my matplotlib calls, which I will also post:

def histogramPlot( self, data, mean, stdDev, xAxis='x-axis', yAxis='y-axis', showPlot = False ):

    '''
    Plots and returns a histogram of some linear data using matplotlib
    and fits a Gaussian centered around the mean with a spread of stdDev.
    Use this function to set the x and y axis names.
    Can also toggle showing of the histogram in this function.
    '''

    # Plot the histogram
    n, bins, patches = plt.hist( self.linearRmsArray, bins=self.nChan, normed=True )

    # Add a 'best fit' normal distribution line
    xPlot = np.linspace( ( mean - (4*stdDev) ), ( mean + (4*stdDev) ), 1000 )
    yPlot = mlab.normpdf( xPlot, mean, stdDev )
    l = plt.plot(xPlot, yPlot, 'r--', linewidth=2)

    # Format axes
    plt.ylabel( yAxis )
    plt.xlabel( xAxis )
    #plt.title(r'$\mathrm{Histogram\ of\ data:}\ \mu=%.3f,\ \sigma=%.3f$' %(mu, sigma))
    plt.title(r'$\mu=%.3f,\ \sigma=%.3f$' %(mean, stdDev))
    plt.grid(True)

    if showPlot == True:
        plt.show()

    return n, bins, patches

【Comments】:

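  • Is there any code so far? At least code that generates something equivalent to the "masked array" you mention?
  • @AndreyTyukin, I have included the code in the edit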

Tags: python python-3.x scipy histogram python-3.6


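【Solution 1】:

You are trying to use scipy.stats.norm.fit to fit a normal distribution to data, which means your input is a collection of values that are supposed to be a random sample from a normal distribution. In that case, the maximum likelihood estimates of the mean and standard deviation are just the sample mean and sample standard deviation of the data. For data that contains nan, you can either remove the nans before calling scipy.stats.norm.fit(), or compute the estimates directly with numpy.nanmean and numpy.nanstd: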

est_mean = np.nanmean(data)
est_stddev = np.nanstd(data)

For example,

In [18]: import numpy as np

In [19]: from scipy.stats import norm

In [20]: x = np.array([1, 4.5, np.nan, 3.3, 10.0, 4.1, 8.5, 17.1, np.nan])

In [21]: np.nanmean(x), np.nanstd(x)
Out[21]: (6.9285714285714288, 5.0366412520687653)

In [22]: norm.fit(x[np.isfinite(x)])
Out[22]: (6.9285714285714288, 5.0366412520687653)

Note that x[np.isfinite(x)] is the array of values in x that are not nan or inf.
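The agreement between the two results above depends on which definition of the sample standard deviation is used. The following is a minimal sketch, reusing the same x as above, that suggests norm.fit matches np.std with ddof=0 (the maximum likelihood form) rather than the ddof=1 estimate. The variable names finite, sigma_ml and sigma_unbiased are introduced here for illustration only.

```python
import numpy as np
from scipy.stats import norm

# Same sample data as in the session above, including two nan entries.
x = np.array([1, 4.5, np.nan, 3.3, 10.0, 4.1, 8.5, 17.1, np.nan])

# Keep only the finite values before fitting.
finite = x[np.isfinite(x)]

# Maximum likelihood fit of a normal distribution.
mu_fit, sigma_fit = norm.fit(finite)

# The ML estimate of sigma divides by N (ddof=0) ...
sigma_ml = np.std(finite, ddof=0)
# ... whereas the usual "unbiased" sample estimate divides by N - 1 (ddof=1).
sigma_unbiased = np.std(finite, ddof=1)

print(mu_fit, sigma_fit)
print(sigma_ml, sigma_unbiased)
```

If you need the N - 1 form, compute it separately with ddof=1 instead of relying on the value returned by the fit.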

If you have a masked array, you can use its mean and std methods:

In [36]: mx = np.ma.masked_array(x, np.isnan(x))

In [37]: mx
Out[37]: 
masked_array(data = [1.0 4.5 -- 3.3 10.0 4.1 8.5 17.1 --],
             mask = [False False  True False False False False False  True],
       fill_value = 1e+20)

In [38]: mx.mean(), mx.std()
Out[38]: (6.9285714285714288, 5.0366412520687653)
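If you still want to call norm.fit on the masked array from the question, one option is to pass it only the unmasked values. The sketch below assumes a small 2D array rms standing in for the RMS matrix (the data are made up for illustration) and uses the compressed() method of masked arrays, which returns the unmasked entries as a plain 1D array. It also builds the density histogram and the fitted curve with np.histogram and norm.pdf, on the assumption that newer Matplotlib releases no longer accept normed= or provide mlab.normpdf.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical stand-in for the 2D RMS matrix, with nan marking bad profiles.
rms = np.array([[1.0, 4.5, np.nan],
                [3.3, 10.0, 4.1],
                [8.5, 17.1, np.nan]])

# Mask the nan (and inf) entries, as in the question.
masked = np.ma.masked_invalid(rms)

# compressed() drops the masked entries and flattens to a plain 1D ndarray,
# so no separate reshape is needed.
values = masked.compressed()

# Fit the normal distribution to the unmasked values only.
mu_fit, sigma_fit = norm.fit(values)

# Density-normalized histogram (the bar areas sum to 1).
density, edges = np.histogram(values, bins=4, density=True)

# Fitted normal pdf evaluated over +/- 4 sigma around the mean.
x_plot = np.linspace(mu_fit - 4 * sigma_fit, mu_fit + 4 * sigma_fit, 1000)
y_plot = norm.pdf(x_plot, loc=mu_fit, scale=sigma_fit)

print(mu_fit, sigma_fit)
```

For plotting, plt.hist(values, density=True) followed by plt.plot(x_plot, y_plot, 'r--') should give the same picture as the histogramPlot function in the question.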

【Comments】:
