【Question title】: Difference between normed plt.xcorr at 0-lag and np.corrcoef
【Posted】: 2017-01-30 09:49:03
【Question】:

I am studying the cross-correlation between two relatively short time series, and I have run into a discrepancy I cannot explain. I understand that plt.xcorr is built on np.correlate. What I cannot reconcile is the difference between plt.xcorr at zero lag and np.corrcoef.

a = np.array([  7.35846410e+08,   8.96271634e+08,   6.16249222e+08,
     8.00739868e+08,   1.06116376e+09,   9.05690167e+08,
     6.31383600e+08])
b = np.array([  1.95621617e+09,   2.06263134e+09,   2.27717015e+09,
     2.27281916e+09,   2.71090116e+09,   2.84676385e+09,
     3.19578883e+09])

np.corrcoef(a,b)
# returns:
array([[ 1.        ,  0.02099573],
      [ 0.02099573,  1.        ]])

plt.xcorr(a,b,normed=True, maxlags=1)
# returns:
array([-1,  0,  1]),
 array([ 0.90510941,  0.97024415,  0.79874158])

I expected these to return the same result. I clearly do not understand how plt.xcorr is normalized. Can someone clarify this for me?

【Comments】:

    Tags: python numpy correlation cross-correlation


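    【Solution 1】:

    I used http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr

    normed : boolean, optional, default: True

    If True, normalize the data by the autocorrelation at the 0-th lag.

    In the following code, the correlation values returned by plt.xcorr (the second element of plt_corr) are equal to np_corr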

    plt_corr = plt.xcorr(a, b, normed=True, maxlags=6)
    
    c = np.correlate(a, a)  # autocorrelation of a
    d = np.correlate(b, b)  # autocorrelation of b
    np_corr = np.correlate(a/np.sqrt(c), b/np.sqrt(d), 'full')
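
The normalization described above can be checked without plotting. The following is a minimal NumPy-only sketch, assuming that normed=True amounts to scaling each input to unit Euclidean length before correlating; the names a_unit, b_unit, zero_lag and cosine are illustrative and not part of either library.

```python
import numpy as np

a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08,
              8.00739868e+08, 1.06116376e+09, 9.05690167e+08,
              6.31383600e+08])
b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09,
              2.27281916e+09, 2.71090116e+09, 2.84676385e+09,
              3.19578883e+09])

# Assumption: normed=True divides by sqrt(sum(a**2) * sum(b**2)),
# i.e. each vector is scaled to unit length. The means are NOT removed.
a_unit = a / np.sqrt(np.dot(a, a))
b_unit = b / np.sqrt(np.dot(b, b))

# Full cross-correlation over all 2*N - 1 lags.
full = np.correlate(a_unit, b_unit, mode="full")

# For equal-length inputs the zero-lag term sits at index N - 1.
zero_lag = full[len(a) - 1]

# The same number written as a cosine similarity of the raw vectors.
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print("zero-lag value:", zero_lag)
print("cosine similarity:", cosine)
```

Under that assumption the zero-lag value is the cosine of the angle between the raw vectors, which is close to 1 whenever both series are strictly positive, whatever their fluctuations look like.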
    

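    【Comments】:

      【Solution 2】:

      The standard "Pearson product-moment correlation coefficient" is computed from samples that have been shifted by their mean values. The cross-correlation coefficient does not center the samples first. Apart from that, the two computations are similar, but the coefficients still have different formulas and different meanings. They are equal only when the means of samples a and b are both 0, that is, when shifting by the mean does not change the samples.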
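
Written out, the two quantities being compared look as follows. This is a sketch in my own notation ($r$ for the Pearson coefficient, $c_0$ for the normalized zero-lag cross-correlation, $\bar a$ and $\bar b$ for the sample means), not taken from either library's documentation:

```latex
r = \frac{\sum_{i=1}^{n} (a_i - \bar a)(b_i - \bar b)}
         {\sqrt{\sum_{i=1}^{n} (a_i - \bar a)^2 \, \sum_{i=1}^{n} (b_i - \bar b)^2}},
\qquad
c_0 = \frac{\sum_{i=1}^{n} a_i b_i}
           {\sqrt{\sum_{i=1}^{n} a_i^2 \, \sum_{i=1}^{n} b_i^2}}
```

Setting $\bar a = \bar b = 0$ in the first expression gives the second, which is why the two agree only for zero-mean samples.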

      import numpy as np
      import matplotlib.pyplot as plt
      
      a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08,
           8.00739868e+08, 1.06116376e+09, 9.05690167e+08, 6.31383600e+08])
      b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09,
           2.27281916e+09, 2.71090116e+09, 2.84676385e+09, 3.19578883e+09])
      
      y = np.corrcoef(a, b)
      z = plt.xcorr(a, b, normed=True, maxlags=1)
      print("Pearson product-moment correlation coefficient between `a` and `b`:", y[0][1])
      print("Cross-correlation coefficient between `a` and `b` with 0-lag:", z[1][1], "\n")
      
      
      # Calculate manually:
      
      def pearson(a, b):
          # Length.
          n = len(a)
      
          # Means.
          ma = sum(a) / n
          mb = sum(b) / n
      
          # Shifted samples.
          _ama = a - ma
          _bmb = b - mb
      
          # Standard deviations.
          sa = np.sqrt(np.dot(_ama, _ama) / n)
          sb = np.sqrt(np.dot(_bmb, _bmb) / n)
      
          # Covariance.
          cov = np.dot(_ama, _bmb) / n
      
          # Final formula.
          # Note: the divisions by `n` in the standard deviations and the covariance
          #       cancel each other out in the final formula and could be omitted.
          return cov / (sa * sb)
      
      def cross0lag(a, b):
          return np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b))
      
      pearson_coeff = pearson(a, b)
      cross_coeff = cross0lag(a, b)
      
      print("Manually calculated coefficients:")
      print("  Pearson =", pearson_coeff)
      print("  Cross   =", cross_coeff, "\n")
      
      
      # Normalized samples:
      am0 = a - sum(a) / len(a)
      bm0 = b - sum(b) / len(b)
      pearson_coeff = pearson(am0, bm0)
      cross_coeff = cross0lag(am0, bm0)
      print("Coefficients for samples with means = 0:")
      print("  Pearson =", pearson_coeff)
      print("  Cross   =", cross_coeff)
      

      Output:

      Pearson product-moment correlation coefficient between `a` and `b`: 0.020995727082
      Cross-correlation coefficient between `a` and `b` with 0-lag: 0.970244146831 
      
      Manually calculated coefficients:
        Pearson = 0.020995727082
        Cross   = 0.970244146831 
      
      Coefficients for samples with means = 0:
        Pearson = 0.020995727082
        Cross   = 0.020995727082
      

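      【Comments】:

      • But I thought the purpose of the normed=True argument of plt.xcorr() was to normalize the input vectors... How does "normalizing the input vectors" in plt.xcorr differ from the normalization done before computing Pearson's r?
      【Solution 3】:

      As DJV's answer says, normed=True in plt.xcorr only normalizes the magnitude. If you also want to shift the data to mean = 0, as is done for Pearson's r, you can add the argument detrend=mlab.detrend_mean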

      import matplotlib.pyplot as plt
      import matplotlib.mlab as mlab
      
      plt.xcorr(a, b, normed=True, maxlags=1, detrend=mlab.detrend_mean)
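
To see why removing the mean closes the gap, here is a small NumPy-only sketch. It assumes that detrend=mlab.detrend_mean simply subtracts each series' mean before the correlation is taken, and it reproduces that step by hand instead of calling matplotlib; a0, b0 and zero_lag_detrended are illustrative names.

```python
import numpy as np

a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08,
              8.00739868e+08, 1.06116376e+09, 9.05690167e+08,
              6.31383600e+08])
b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09,
              2.27281916e+09, 2.71090116e+09, 2.84676385e+09,
              3.19578883e+09])

# Assumed equivalent of mean detrending: subtract each series' mean.
a0 = a - a.mean()
b0 = b - b.mean()

# Unit-length normalized correlation at zero lag, on the centered data.
zero_lag_detrended = np.dot(a0, b0) / np.sqrt(np.dot(a0, a0) * np.dot(b0, b0))

# Pearson's r from NumPy for comparison.
pearson_r = np.corrcoef(a, b)[0, 1]

print("zero lag after mean removal:", zero_lag_detrended)
print("np.corrcoef:", pearson_r)
```

Both values should agree with the 0.02099573 reported in the question, since centering followed by unit-length normalization is the Pearson formula.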
      

      【Comments】:
