Principal Component Analysis (PCA)

PCA is a method for dimensionality reduction. It transforms a multidimensional dataset into a new set of uncorrelated dimensions, ordered so that the first one captures as much of the variance in the data as possible and each following one captures as much of the remaining variance as possible. This allows you to represent the data by a few representative dimensions, the principal components.

Each principal component is a vector of weights that defines one transformed dimension as a linear combination of the original dimensions. Keeping more principal components gives you a more accurate description of the data.
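To make the "vector of weights" idea concrete, here is a minimal sketch using R's built-in iris data. The variable names (x, pca, centred, scores) are chosen for illustration only. It shows that each column of the rotation matrix returned by prcomp is one principal component, and that projecting the centred data onto those vectors reproduces the transformed coordinates.

```r
# Illustrative sketch: principal components as weight vectors
x <- as.matrix(iris[, 1:4])
pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Each column of `rotation` is one principal component:
# a unit-length vector of weights on the original variables
pca$rotation

# Projecting the centred data onto these vectors gives the
# coordinates in the transformed dimensions
centred <- scale(x, center = pca$center, scale = FALSE)
scores <- centred %*% pca$rotation

# These should agree with the scores that prcomp stores in `pca$x`
all.equal(unname(scores), unname(pca$x))
```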

Example:

Imagine you have data on a number of ships. Each ship has a number of descriptors: length, width, height, gross register tonnage (BRT), depth, number of crew members, engine power, etc. PCA allows you to summarize all these numbers in a single number, the first principal component. With ships, it will roughly correspond to the size of the ship, because most of the other numbers grow with ship size.
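The ship example can be tried out on simulated data. The sketch below is purely hypothetical: the ships, the column names, and all numbers are invented, and the data is generated from a hidden "size" factor plus noise. Under that assumption, the first principal component should track size closely.

```r
# Hypothetical, simulated ship data (all values are made up)
set.seed(1)
n <- 200
size <- rnorm(n)  # hidden overall-size factor driving every descriptor

ships <- data.frame(
  length       = 100  + 30   * size + rnorm(n, sd = 5),
  width        = 15   + 4    * size + rnorm(n, sd = 1),
  crew         = 20   + 6    * size + rnorm(n, sd = 2),
  engine_power = 5000 + 1500 * size + rnorm(n, sd = 300)
)

# Scale the variables because they are measured in very different units
ships.pca <- prcomp(ships, center = TRUE, scale. = TRUE)

# Share of the total variance captured by each component
summary(ships.pca)

# Correlation between the first component and the hidden size factor
# (the sign of a principal component is arbitrary, hence abs())
size.cor <- abs(cor(ships.pca$x[, 1], size))
size.cor
```

In real data there is no known "size" column to compare against; the comparison is only possible here because the data was simulated.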

PCA is useful if you want to:

visualize multidimensional data
simplify a complex dataset
get an impression of the shape of your data
speed up calculations
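As a sketch of how simplifying a dataset might look in practice, the following keeps only as many components as are needed to cover a chosen share of the variance. The 95% threshold and the variable names are assumptions made for illustration, not a recommendation.

```r
# Illustrative sketch: reduce a dataset to its first k components
x <- scale(as.matrix(iris[, 1:4]))  # centre and scale each column
pca <- prcomp(x)

# Cumulative share of variance covered by the first 1, 2, ... components
cum.var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components reaching the (arbitrary) 95% threshold
k <- which(cum.var >= 0.95)[1]

# The reduced dataset: same rows, only k columns
reduced <- pca$x[, 1:k, drop = FALSE]
dim(reduced)

# Mapping back to the original space gives an approximation of the data;
# the mean squared error shows how much information was dropped
approx <- reduced %*% t(pca$rotation[, 1:k, drop = FALSE])
mean((x - approx)^2)
```

Later calculations can then run on reduced instead of the full dataset, which is where the speed-up comes from.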

PCA is usually of little use if your data has only 2-3 dimensions, because such data can be plotted and inspected directly.

Code

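# Load the built-in iris dataset
data(iris)

# Log-transform the four numeric measurement columns
iris.log <- log(iris[, 1:4])

# Keep the species labels separately (they are not used in the PCA itself)
iris.species <- iris[, 5]

# Run the PCA on centered variables scaled to unit variance
iris.pca <- prcomp(iris.log, center = TRUE, scale. = TRUE)

# Standard deviations and weights (rotation) of the components
print(iris.pca)

# Scree plot: variance of each component
plot(iris.pca, type = "l")

# Proportion of variance and cumulative proportion per component
summary(iris.pca)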
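One possible next step, sketched here as an assumption rather than as part of the original example, is to plot the first two components and colour the points by species. This uses the iris.species labels and base R graphics only; the block repeats the setup so that it runs on its own.

```r
# Repeat the setup so this block is self-contained
iris.log <- log(iris[, 1:4])
iris.species <- iris[, 5]
iris.pca <- prcomp(iris.log, center = TRUE, scale. = TRUE)

# Coordinates of every flower on the first two principal components
scores <- iris.pca$x[, 1:2]

# Scatter plot, one colour per species
plot(scores, col = as.integer(iris.species), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(iris.species),
       col = seq_along(levels(iris.species)), pch = 19)
```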

Also see:

https://tgmstat.wordpress.com/2013/11/28/computing-and-visualizing-pca-in-r/
