【Question title】: python plotting eigenvectors
【Posted】: 2020-05-09 23:39:10
【Question】:

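I thought eigenvectors had to be orthogonal to each other. The following seems to violate that. I'd like to check whether I'm doing something wrong. Thanks for any insight!

Here is the code for the PCA (data at the bottom of the post):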

import numpy as np
from numpy import array
from numpy import mean
from numpy import cov
from numpy.linalg import eig


#calculate the mean of each column
M = mean(df.T, axis=1)

# center columns by subtracting column means
C = df - M

# calculate covariance matrix of centered matrix
V = cov(df.T)

# eigendecomposition of covariance matrix
values, vectors = eig(V)

# project data
P = vectors.T.dot(C.T)

#Make a list of (eigenvalue, eigenvector) tuples
eig_pairs = [(np.abs(values[i]), vectors[:,i]) for i in range(len(values))]

# Sort the (eigenvalue, eigenvector) tuples from high to low
eig_pairs.sort(key=lambda x: x[0], reverse=True)


matrix_w = np.hstack((eig_pairs[0][1].reshape(20,1), eig_pairs[1][1].reshape(20,1)))
#print('Matrix W:\n', matrix_w)

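All I did to plot the eigenvectors here was grab the first two rows of matrix_w. Is that correct? I just typed them by hand into the array M. Is my matrix_w wrong, or is my choice of vectors for the first two principal components incorrect?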

import matplotlib.pyplot as plt

M = np.array([[0.00747255, 0.16222854], [-0.18394907, 0.12426324]])
rows, cols = M.T.shape

# Get absolute maxes for axis ranges to center origin
maxes = 1.1 * np.amax(abs(M), axis=0)

for i in range(cols):
    xs = [0, M[i, 0]]
    ys = [0, M[i, 1]]
    plt.plot(xs, ys)

plt.plot(0, 0, 'ok')  # plot a black point at the origin
plt.axis('equal')     # set the axes to the same scale

plt.legend(['V' + str(i + 1) for i in range(cols)])  # give a legend
plt.grid(True, which='major')  # plot grid lines
plt.show()

This is what the plotted vectors look like, but they are not orthogonal.

Here is the data (already normalized with np.log):

import pandas as pd

data = [[1.954242509439325,
  1.6901960800285136,
  1.9444826721501687,
  1.2787536009528289,
  1.7558748556724915,
  1.7075701760979363,
  1.2787536009528289,
  1.3222192947339193,
  1.4313637641589874,
  1.3222192947339193,
  1.9084850188786497,
  1.8750612633917,
  1.6434526764861874,
  1.8512583487190752,
  1.3424226808222062,
  1.9590413923210936,
  1.9294189257142926,
  1.8692317197309762,
  1.4771212547196624,
  1.414973347970818],
 [1.9138138523837167,
  1.0,
  1.7781512503836436,
  0.3010299956639812,
  1.7403626894942439,
  1.6127838567197355,
  0.47712125471966244,
  0.3010299956639812,
  0.6020599913279624,
  0.3010299956639812,
  1.8260748027008264,
  1.8512583487190752,
  0.9542425094393249,
  1.662757831681574,
  1.9030899869919435,
  1.8195439355418688,
  1.380211241711606,
  1.9731278535996986,
  0.6989700043360189,
  1.255272505103306],
 [1.9444826721501687,
  1.6232492903979006,
  1.7993405494535817,
  0.6020599913279624,
  1.8808135922807914,
  1.724275869600789,
  1.0413926851582251,
  1.3617278360175928,
  1.0413926851582251,
  0.6989700043360189,
  1.9395192526186185,
  1.9242792860618816,
  1.6020599913279623,
  1.6532125137753437,
  1.9444826721501687,
  1.9731278535996986,
  1.6720978579357175,
  1.5563025007672873,
  1.7558748556724915,
  0.47712125471966244],
 [1.9822712330395684,
  1.792391689498254,
  1.9912260756924949,
  1.505149978319906,
  1.792391689498254,
  1.8260748027008264,
  1.6334684555795864,
  0.8450980400142568,
  1.146128035678238,
  1.146128035678238,
  1.919078092376074,
  1.9493900066449128,
  1.7853298350107671,
  1.9084850188786497,
  1.1760912590556813,
  1.4913616938342726,
  1.9867717342662448,
  1.1139433523068367,
  1.724275869600789,
  1.1760912590556813],
 [1.9731278535996986,
  1.5797835966168101,
  1.6812412373755872,
  1.0413926851582251,
  1.8692317197309762,
  1.568201724066995,
  1.3617278360175928,
  0.9542425094393249,
  1.1139433523068367,
  1.0791812460476249,
  1.8808135922807914,
  1.8808135922807914,
  1.6232492903979006,
  1.7558748556724915,
  1.462397997898956,
  1.9242792860618816,
  1.9030899869919435,
  1.919078092376074,
  1.3010299956639813,
  0.6989700043360189],
 [1.9867717342662448,
  1.7853298350107671,
  1.9344984512435677,
  1.4471580313422192,
  1.8976270912904414,
  1.863322860120456,
  1.0791812460476249,
  0.8450980400142568,
  1.414973347970818,
  1.3617278360175928,
  1.9294189257142926,
  1.9731278535996986,
  1.919078092376074,
  1.3010299956639813,
  1.9590413923210936,
  1.9731278535996986,
  1.9731278535996986,
  1.9242792860618816,
  1.4913616938342726,
  1.380211241711606],
 [1.4313637641589874,
  1.9344984512435677,
  1.99563519459755,
  1.3424226808222062,
  1.9590413923210936,
  1.7403626894942439,
  1.8808135922807914,
  1.2304489213782739,
  1.3010299956639813,
  1.380211241711606,
  1.8808135922807914,
  1.8325089127062364,
  1.9493900066449128,
  1.9590413923210936,
  1.0413926851582251,
  1.9777236052888478,
  1.9731278535996986,
  1.7558748556724915,
  1.0413926851582251,
  1.4471580313422192],
 [1.8573324964312685,
  1.414973347970818,
  1.8864907251724818,
  0.3010299956639812,
  1.3424226808222062,
  1.5314789170422551,
  0.0,
  0.6989700043360189,
  1.3010299956639813,
  0.47712125471966244,
  1.3424226808222062,
  1.7075701760979363,
  0.9030899869919435,
  1.2041199826559248,
  1.9493900066449128,
  1.8129133566428555,
  1.8920946026904804,
  1.9637878273455553,
  0.7781512503836436,
  0.9542425094393249],
 [1.7403626894942439,
  1.4913616938342726,
  1.7853298350107671,
  1.1760912590556813,
  1.462397997898956,
  1.5185139398778875,
  0.0,
  0.6989700043360189,
  1.1760912590556813,
  1.0413926851582251,
  1.6901960800285136,
  1.6232492903979006,
  1.146128035678238,
  1.6127838567197355,
  1.7075701760979363,
  1.7075701760979363,
  1.8573324964312685,
  1.4471580313422192,
  1.1139433523068367,
  1.0413926851582251],
 [1.863322860120456,
  1.8573324964312685,
  1.9294189257142926,
  1.3979400086720377,
  1.4913616938342726,
  1.8388490907372552,
  1.0,
  1.2304489213782739,
  1.2787536009528289,
  1.1760912590556813,
  1.8976270912904414,
  1.845098040014257,
  1.662757831681574,
  1.7853298350107671,
  1.806179973983887,
  1.9138138523837167,
  1.6812412373755872,
  1.7853298350107671,
  1.6812412373755872,
  1.4771212547196624],
 [1.9822712330395684,
  1.2304489213782739,
  1.9637878273455553,
  1.5440680443502757,
  1.8195439355418688,
  1.505149978319906,
  1.2304489213782739,
  1.0413926851582251,
  1.7075701760979363,
  1.6232492903979006,
  1.9084850188786497,
  1.8573324964312685,
  1.6989700043360187,
  1.806179973983887,
  1.0413926851582251,
  1.9637878273455553,
  1.9590413923210936,
  1.4771212547196624,
  1.0413926851582251,
  1.5314789170422551],
 [1.9637878273455553,
  1.2304489213782739,
  1.919078092376074,
  1.1139433523068367,
  1.792391689498254,
  1.7075701760979363,
  0.6020599913279624,
  1.2304489213782739,
  1.4771212547196624,
  1.1760912590556813,
  1.7853298350107671,
  1.8573324964312685,
  1.5314789170422551,
  1.7075701760979363,
  1.0413926851582251,
  1.7993405494535817,
  1.9731278535996986,
  1.4471580313422192,
  0.3010299956639812,
  1.792391689498254],
 [1.4771212547196624,
  1.7160033436347992,
  1.99563519459755,
  1.0413926851582251,
  1.9030899869919435,
  1.8750612633917,
  1.255272505103306,
  0.3010299956639812,
  0.6989700043360189,
  0.47712125471966244,
  1.7558748556724915,
  1.7160033436347992,
  1.662757831681574,
  1.9493900066449128,
  0.6989700043360189,
  1.9867717342662448,
  1.3979400086720377,
  1.4913616938342726,
  0.47712125471966244,
  0.9542425094393249]]

df = pd.DataFrame(data, columns=['Real coffee', 'Instant coffee', 'Tea', 'Sweetener', 'Biscuits',
       'Powder soup', 'Tin soup', 'Potatoes', 'Frozen fish', 'Frozen veggies',
       'Apples', 'Oranges', 'Tinned fruit', 'Jam', 'Garlic', 'Butter',
       'Margarine', 'Olive oil', 'Yoghurt', 'Crisp bread'])

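【Comments】:

  • There is no general constraint that eigenvectors must be orthogonal. The eigenvectors of a correlation matrix should be orthogonal, though. Your sorting is hard to follow; why not check every pair of vectors for orthogonality with np.dot(vectors[:, col_i], vectors[:, col_j])? If they are orthogonal, this dot product should be 0 for all i and j (except i == j).
  • Consider sorting like this instead: order = np.argsort(values), matrix_w = vectors[:, order]
  • Also, what is the shape of vectors? Unless it is 2×2, it looks like you have clipped the vectors, so of course they are no longer orthogonal; you have just projected them from (I assume) 20D down to 2D.
  • @Dan The shape of vectors is (20, 20). I don't understand how to check orthogonality with np.dot. Do I need a loop? Could I do something like matrix_w.dot(matrix_w.T)?
  • You can use a loop. Otherwise, I think vectors @ vectors.T would effectively take the dot product of every pair (just look at the lower triangle). Your orthogonality holds in 20D; there is no reason for it to survive a projection to 2D. Think about what happens when you project 3D axes onto 2D (as in every 3D chart you have ever seen): the z axis is no longer orthogonal to x or y. That is basically what you are doing.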
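The sorting suggested in the comments can be sketched roughly as follows. Note that np.argsort sorts in ascending order, so the index array has to be reversed to put the largest eigenvalue first. This is only a sketch under assumptions: X is hypothetical random stand-in data with the same shape as the question's df, and np.linalg.eigh (the routine for symmetric matrices) is swapped in for the eig call used in the question.

```python
import numpy as np

# Hypothetical stand-in for df.values: 13 samples, 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 20))

V = np.cov(X.T)                        # 20x20 covariance matrix
values, vectors = np.linalg.eigh(V)    # assumption: eigh instead of the question's eig

# argsort is ascending, so reverse it to get the largest eigenvalue first.
order = np.argsort(values)[::-1]
values = values[order]
vectors = vectors[:, order]

# The first two columns are the top two principal directions, shape (20, 2).
matrix_w = vectors[:, :2]
print(matrix_w.shape)
```

Keeping the eigenvectors as columns of one array avoids the list of tuples and the manual reshape(20, 1) calls.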

Tags: python numpy pca eigenvector orthogonal


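【Solution 1】:

A simple way to check whether two vectors are orthogonal is to see whether their dot product is zero. In your case, the orthogonal vectors should be the columns of vectors (i.e. the eigenvectors of the covariance matrix). For example, the following should run without raising an error: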

n, m = vectors.shape
for col_i in range(m):
    for col_j in range(m):
        # Strictly less than: skip a column paired with itself, and the dot
        # product is commutative, so each pair only needs checking once.
        if col_i < col_j:
            dot_product = np.dot(vectors[:, col_i], vectors[:, col_j])
            if not np.isclose(dot_product, 0):
                raise ValueError(f"Eigenvector {col_i} and Eigenvector {col_j} are not orthogonal.")

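A faster way is to remember that a matrix product is just the dot products of the rows of the first matrix with the columns of the second, so vectors.T @ vectors computes every pairwise dot product of the columns at once. We then want to check that the lower triangle of the result, excluding the diagonal (for the same reason as the if col_i < col_j in the loop), is all zeros:

np.all(np.isclose(np.tril(vectors.T @ vectors, -1), 0))

This should return True.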
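A self-contained version of this check might look like the sketch below. Two things here are assumptions rather than part of the original post: X is hypothetical random data with the same shape as the question's df, and np.linalg.eigh is used in place of eig. The reason for the swap is that eig makes no orthogonality promise when eigenvalues repeat, and with 13 rows and 20 columns the covariance matrix has rank at most 12, so several eigenvalues are numerically zero. eigh is meant for symmetric matrices and returns orthonormal eigenvectors, so the check may be more reliable with it.

```python
import numpy as np

# Hypothetical stand-in for df.values: 13 samples, 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 20))
V = np.cov(X.T)

# Assumption: eigh (symmetric solver) instead of the question's eig.
values, vectors = np.linalg.eigh(V)

# Entry (i, j) of the Gram matrix is the dot product of columns i and j.
gram = vectors.T @ vectors

# Off-diagonal entries should be ~0 (orthogonal columns) ...
off_diag_zero = np.all(np.isclose(np.tril(gram, -1), 0))
# ... and diagonal entries should be ~1 (unit-length columns).
unit_length = np.allclose(np.diag(gram), 1)

print(off_diag_zero, unit_length)
```

Comparing the whole Gram matrix against the identity with np.allclose(gram, np.eye(len(gram))) would test both conditions in one call.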

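The reason your plot does not look orthogonal is that you took two 20D vectors and arbitrarily projected them onto 2D. When you do that, there is no guarantee they will stay orthogonal. For example, consider the familiar plot of xyz axes:

You know the z axis is orthogonal to the x axis, but once you project it onto 2D, the angle you see depends on the projection angle and is no longer a right angle.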
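To make this concrete, the sketch below first measures the angle between the two 2D vectors typed into M in the question, then repeats the experiment on hypothetical data. The random X, the use of np.linalg.eigh, and the names W and top are all assumptions for illustration and do not come from the original post.

```python
import numpy as np

# The two 2D vectors that were typed into M in the question.
a = np.array([0.00747255, 0.16222854])
b = np.array([-0.18394907, 0.12426324])

cos_angle = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
angle_deg = np.degrees(np.arccos(cos_angle))
print(f"angle between the plotted vectors: {angle_deg:.1f} degrees")

# Hypothetical stand-in data with the same shape as the question's df.
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 20))
_, vectors = np.linalg.eigh(np.cov(X.T))

# eigh sorts eigenvalues in ascending order, so the last two columns
# belong to the two largest eigenvalues.
W = vectors[:, [-1, -2]]             # shape (20, 2), plays the role of matrix_w
full_dot = np.dot(W[:, 0], W[:, 1])  # uses all 20 coordinates: ~0

# Keep only the first two rows, as in the question. Nothing forces the
# rows (or columns) of this 2x2 slice to be orthogonal.
top = W[:2]
rows_dot = np.dot(top[0], top[1])
print(f"20D dot product: {full_dot:.2e}, 2x2 slice rows dot product: {rows_dot:.2e}")
```

If the goal is a 2D picture of the PCA result, a common alternative is to plot the centered data projected onto the two components (the centered data matrix multiplied by matrix_w) rather than a slice of the eigenvector coordinates.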
