Last asked: 30 Sep, 2014
What is an intuitive explanation of the relation between PCA and SVD?
3 Answers
The result of this process is a ranked list of "directions" in the feature space, ordered from most variance to least. The directions along which there is the greatest variance are referred to as the "principal components" (of variation in the data). The common wisdom is that by focusing exclusively on how the data is distributed along these dimensions, one can capture most of the information represented in the original feature space without having to deal with such a high number of dimensions, which can be of great benefit in statistical modeling and Data Science applications (see: When and where do we use SVD?).
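As a rough illustration of that ranked list, here is a minimal NumPy sketch. The toy data, the helper name `principal_directions`, and the choice to keep two directions are all assumptions made for the example, not part of the answer above.

```python
import numpy as np

def principal_directions(X):
    """Return unit directions (as rows) and the sample variance along each,
    ordered from most variance to least."""
    Xc = X - X.mean(axis=0)                      # mean-center each feature
    # Thin SVD; NumPy returns singular values in descending order
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)          # variance along each direction
    return Vt, variances

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 points, 3 features with very different spreads
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])

directions, variances = principal_directions(X)
explained = variances / variances.sum()          # share of total variance per direction

# Keep only the top 2 directions: each point goes from 3 coordinates to 2
X_reduced = (X - X.mean(axis=0)) @ directions[:2].T
```

Here `explained` is the ranked list in numeric form: if the first few entries already sum to nearly 1, the remaining directions carry little of the variation and can be dropped.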
What is the Formal Relation between SVD and PCA?
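Let's let the matrix $M$ be an $m \times n$ data matrix with singular value decomposition $M = U \Sigma V^T$, where
- $U$ is an $m \times m$ orthogonal matrix whose columns are the left singular vectors of $M$.
- $V$ is an $n \times n$ orthogonal matrix whose columns are the right singular vectors of $M$.
- And, $\Sigma$ is an $m \times n$ diagonal matrix whose diagonal entries are the singular values of $M$.
- Note, $\vec{u}$ and $\vec{v}$ are left and right singular vectors of $M$ with singular value $\sigma$ if they satisfy the following equations:
- $M \vec{v} = \sigma \vec{u}$ and
- $M^T \vec{u} = \sigma \vec{v}$
PCA: PCA sidesteps the problem of $M$ not being square by working with the square, symmetric matrix $M^T M$ so...
- $M^T M = (U \Sigma V^T)^T (U \Sigma V^T)$
- $M^T M = (V \Sigma^T U^T)(U \Sigma V^T)$
- but since $U^T U = I$
- $M^T M = V \Sigma^T \Sigma V^T$
where explicitly forming $M^T M$ can lead to numerical rounding errors when calculating the eigenvalues/vectors.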
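The formal relation can be checked numerically. The sketch below assumes a small random matrix `M` (hypothetical data) and compares two routes: the SVD of `M` itself and the eigendecomposition of the product of `M` transposed with `M`. It also shows one standard reason the second route is more sensitive to rounding: forming that product squares the condition number.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(50, 4))                  # hypothetical 50 x 4 data matrix

# Route 1: SVD applied directly to M
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Route 2: eigendecomposition of the symmetric matrix M^T M
evals, evecs = np.linalg.eigh(M.T @ M)        # eigh returns ascending eigenvalues
evals, evecs = evals[::-1], evecs[:, ::-1]    # reorder to descending

# Eigenvalues of M^T M are the squared singular values of M
same_values = np.allclose(evals, s**2)

# Eigenvectors of M^T M match the right singular vectors up to sign
same_vectors = np.allclose(np.abs(evecs.T @ Vt.T), np.eye(4), atol=1e-6)

# A singular vector pair satisfies M v = sigma u and M^T u = sigma v
u, v, sigma = U[:, 0], Vt[0], s[0]
pair_ok = np.allclose(M @ v, sigma * u) and np.allclose(M.T @ u, sigma * v)

# Forming M^T M squares the 2-norm condition number of M
cond_M = np.linalg.cond(M)
cond_MtM = np.linalg.cond(M.T @ M)
```

For this well-conditioned toy matrix both routes agree to machine precision. The difference only matters when `M` is close to rank-deficient, in which case working from the SVD of `M` avoids the squared conditioning.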
David Beniaguev
Tigran Ishkhanov
PCA is a statistical technique in which SVD is used as a low-level linear algebra algorithm. One can apply SVD to any matrix C. In PCA this matrix C arises from the data and has a statistical meaning: the element c_ij is the covariance between the i-th and j-th coordinates of your dataset after mean-normalization.
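A small sketch of that description, assuming a hypothetical dataset `X` with one sample per row: build C from the mean-normalized data, then apply SVD to C.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))            # hypothetical: 100 samples, 3 coordinates

Xc = X - X.mean(axis=0)                  # mean-normalization
C = Xc.T @ Xc / (X.shape[0] - 1)         # c_ij = covariance of coordinates i and j

# SVD as the low-level step; C is symmetric positive semi-definite,
# so its singular values are its eigenvalues (the variances along each direction)
U, s, Vt = np.linalg.svd(C)

# Columns of U are the principal directions; C is recovered from them
reconstructed = U @ np.diag(s) @ U.T
```

The same `C` is what `np.cov(X, rowvar=False)` returns, so the hand-built version above is only there to make the meaning of each entry explicit.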