【Question title】: Python PCA - projection into lower dimensional space
【Posted】: 2016-04-21 13:43:53
【Question description】:

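I am trying to implement PCA. The intermediate results, such as the eigenvalues and eigenvectors, come out correctly. However, when I project the (3-dimensional) data into the 2D principal component space, the result is wrong. I have spent a lot of time comparing my code with other implementations, for example: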

http://sebastianraschka.com/Articles/2014_pca_step_by_step.html

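After a long time without progress, I still cannot find the error. Since the intermediate results are correct, I assume the problem is a simple coding mistake. Thanks in advance to anyone who actually reads this question, and even more to those who leave helpful comments or answers.

My code is as follows: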

import numpy as np

class PCA():
    def __init__(self, X):
        # center the data
        X = X - X.mean(axis=0)
        # calculate covariance matrix based on X where data points are represented in rows
        C = np.cov(X, rowvar=False)
        # get eigenvectors and eigenvalues
        d, u = np.linalg.eigh(C)
        # sort both eigenvectors and eigenvalues descending regarding the eigenvalue
        # the output of np.linalg.eigh is sorted ascending, therefore both are turned around to reach a descending order
        self.U = np.asarray(u).T[::-1]
        self.D = d[::-1]

    # **problem starts here**

    def project(self, X, m):
        # use the top m eigenvectors with the highest eigenvalues for the transformation matrix
        Z = np.dot(X, np.asmatrix(self.U[:m]).T)
        return Z

The result of my code is:

 myresult
 ([[ 0.03463706, -2.65447128],
   [-1.52656731,  0.20025725],
   [-3.82672364,  0.88865609],
   [ 2.22969475,  0.05126909],
   [-1.56296316, -2.22932369],
   [ 1.59059825,  0.63988429],
   [ 0.62786254, -0.61449831],
   [ 0.59657118,  0.51004927]])

Correct result, e.g. as produced by sklearn's PCA:
([[ 0.26424835, -2.25344912],
 [-1.29695602,  0.60127941],
 [-3.59711235,  1.28967825],
 [ 2.45930604,  0.45229125],
 [-1.33335186, -1.82830153],
 [ 1.82020954,  1.04090645],
 [ 0.85747383, -0.21347615],
 [ 0.82618248,  0.91107143]])

The input is defined as follows: 
X = np.array([
[-2.133268233289599,0.903819474847349,2.217823388231679,-0.444779660856219,-0.661480010318842,-0.163814281248453,-0.608167714051449, 0.949391996219125],
[-1.273486742804804,-1.270450725314960,-2.873297536940942, 1.819616794091556,-2.617784834189455, 1.706200163080549,0.196983250752276,0.501491995499840],
[-0.935406638147949,0.298594472836292,1.520579082270122,-1.390457671168661,-1.180253547776717,-0.194988736923602,-0.645052874385757,-1.400566775105519]]).T 
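
One way to narrow this kind of discrepancy down is to subtract the two outputs listed above row by row. The snippet below is only a rough diagnostic sketch; the names `my_result` and `reference` are introduced here for illustration and are not part of the original code. If every row differs by the same vector, the projection directions agree and only a constant translation is missing, which would point at how the data is shifted before projection rather than at the eigenvectors.

```python
import numpy as np

# Output of the custom project() as listed in the question (hypothetical name).
my_result = np.array([
    [ 0.03463706, -2.65447128],
    [-1.52656731,  0.20025725],
    [-3.82672364,  0.88865609],
    [ 2.22969475,  0.05126909],
    [-1.56296316, -2.22932369],
    [ 1.59059825,  0.63988429],
    [ 0.62786254, -0.61449831],
    [ 0.59657118,  0.51004927]])

# Reference output as listed in the question (hypothetical name).
reference = np.array([
    [ 0.26424835, -2.25344912],
    [-1.29695602,  0.60127941],
    [-3.59711235,  1.28967825],
    [ 2.45930604,  0.45229125],
    [-1.33335186, -1.82830153],
    [ 1.82020954,  1.04090645],
    [ 0.85747383, -0.21347615],
    [ 0.82618248,  0.91107143]])

# Row-wise difference between the two projections.
diff = my_result - reference

# True when every row differs by the same offset (up to print rounding).
offset_is_constant = np.allclose(diff, diff[0], atol=1e-6)
print(diff[0], offset_is_constant)
```

For the numbers posted here the difference is the same in every row, so the two results differ only by a fixed shift in the 2D space.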

【Question discussion】:

    Tags: python numpy matrix projection pca


    【Solution 1】:

    Before projecting the data onto the new basis, you need to center it by subtracting the mean:

    mu = X.mean(0)
    C = np.cov(X - mu, rowvar=False)
    d, u = np.linalg.eigh(C)
    U = u.T[::-1]
    Z = np.dot(X - mu, U[:2].T)
    
    print(Z)
    # [[ 0.26424835 -2.25344912]
    #  [-1.29695602  0.60127941]
    #  [-3.59711235  1.28967825]
    #  [ 2.45930604  0.45229125]
    #  [-1.33335186 -1.82830153]
    #  [ 1.82020954  1.04090645]
    #  [ 0.85747383 -0.21347615]
    #  [ 0.82618248  0.91107143]]
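
Applied to the class from the question, the same idea might look like the sketch below. It assumes the only change needed is to remember the mean computed in `__init__` and subtract it again in `project`; the attribute name `mu` is an illustrative choice, not something from the original post. Note that eigenvectors are only defined up to sign, so individual columns of the projection may come out negated compared with another implementation.

```python
import numpy as np

class PCA:
    def __init__(self, X):
        # Store the mean so that project() can center data the same way.
        self.mu = X.mean(axis=0)
        # Covariance of the centered data; rows are observations.
        C = np.cov(X - self.mu, rowvar=False)
        # eigh returns eigenvalues in ascending order, eigenvectors as columns.
        d, u = np.linalg.eigh(C)
        # Rows of U are eigenvectors, ordered by descending eigenvalue.
        self.U = u.T[::-1]
        self.D = d[::-1]

    def project(self, X, m):
        # Center with the stored mean, then project onto the top m eigenvectors.
        return np.dot(X - self.mu, self.U[:m].T)

# Input data from the question: 8 observations, 3 features.
X = np.array([
    [-2.133268233289599, 0.903819474847349, 2.217823388231679, -0.444779660856219,
     -0.661480010318842, -0.163814281248453, -0.608167714051449, 0.949391996219125],
    [-1.273486742804804, -1.270450725314960, -2.873297536940942, 1.819616794091556,
     -2.617784834189455, 1.706200163080549, 0.196983250752276, 0.501491995499840],
    [-0.935406638147949, 0.298594472836292, 1.520579082270122, -1.390457671168661,
     -1.180253547776717, -0.194988736923602, -0.645052874385757, -1.400566775105519]]).T

pca = PCA(X)
Z = pca.project(X, 2)
print(Z)
```

Keeping the mean on the object also means that new samples passed to `project` later are shifted by the training mean rather than by their own mean, which is usually what is wanted for a fitted transform.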
    

    【Discussion】:
