from Hacker News

How are PCA and SVD related?

by celerity on 8/23/17, 12:20 PM with 15 comments

  • by twelfthnight on 8/24/17, 2:21 AM

    For those looking for a more succinct answer: https://stats.stackexchange.com/questions/134282/relationshi...

    And here is another interesting connection between PCA and ridge regression: https://stats.stackexchange.com/questions/81395/relationship...

  • by vcdimension on 8/24/17, 12:40 PM

    I don't understand why people create these webpages just re-explaining stuff that can be read in a book, lecture notes (usually available freely online), or Wikipedia. It just adds more noise to the internet. Is it a kind of marketing thing to show their customers that they know what they are doing?

  • by gabrielgoh on 8/24/17, 2:27 AM

    6 word answer

    PCA is the SVD of A'A
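
    Since A'A is symmetric positive semi-definite, its SVD coincides with its eigendecomposition, so the six words check out. A minimal numpy sketch of the equivalence (illustrative only; assumes A holds centered data in its rows):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 4))
    A -= A.mean(axis=0)                  # center the data first

    # Eigendecomposition of A'A (the unnormalized covariance matrix)
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]    # sort by decreasing variance
    eigvecs = eigvecs[:, order]

    # SVD of the centered data matrix itself
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Right singular vectors of A match the eigenvectors of A'A up to sign,
    # and the squared singular values match the eigenvalues
    assert np.allclose(np.abs(Vt), np.abs(eigvecs.T))
    assert np.allclose(s**2, eigvals[order])
    ```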

  • by thanatropism on 8/24/17, 11:44 AM

    PCA is a statistical model -- the simplest factor model there is. It deals with variances and covariances in datasets. It returns a transformed dataset that's linearly related to the original one, with the first variable carrying the highest variance, the second the next highest, and so on.

    SVD is a matrix decomposition. It generalizes the idea of representing a linear transformation (with the same dimensions in domain and codomain) in the basis of its eigenvectors, which gives a diagonal matrix representation and a formula like A = VDV'.

    SVD is like this, but for rectangular matrices. So you have two orthogonal matrices instead of one: A = UDV'.

    That SVD actually performs PCA, as noted in the article's algorithms, is a theorem, albeit a simple one usually given as an exercise. But hey, even OLS regression can be computed with SVD if you want to.
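
    On that last point, a minimal numpy sketch of OLS via the SVD, solving least squares through the pseudoinverse (the data here is made up for illustration):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.standard_normal(50)

    # OLS via the SVD: beta = V diag(1/s) U' y, i.e. the pseudoinverse of X
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    beta_svd = Vt.T @ ((U.T @ y) / s)

    # Agrees with the usual least-squares solver
    beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
    assert np.allclose(beta_svd, beta_lstsq)
    ```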

  • by kiernanmcgowan on 8/24/17, 1:01 AM

    I've always understood PCA as SVD on a whitened matrix. Is this too simplistic of a view to take wrt implementation?

    https://en.m.wikipedia.org/wiki/Whitening_transformation

  • by popcorncolonel on 8/24/17, 12:41 AM

    The connection between these two has always been hazy to me. I often mixed up the two when talking about each of them independently.

    This article was well written, exactly as precise as it needed to be, and it cleared up the confusion. Thanks for sharing!

  • by eggie5 on 8/24/17, 9:15 AM

    SVD is the decomposition of a matrix into its components.

    PCA is the analysis of a set of eigenvectors. Eigenvectors can come from SVD components or a covariance matrix.

    source: http://www.eggie5.com/107-svd-and-pca

  • by foxh0und on 8/24/17, 12:54 AM

    Great article. The lecture comparing the two from Johns Hopkins, part of the Data Science Specialization on Coursera, also offers a great explanation.

  • by finknotal on 8/24/17, 7:23 AM

    "Because vectors are typically written horizontally, we transpose the vectors to write them vertically". Is there a typo in this sentence, or is it just too early in the morning for me to read this?