# Principal Component Analysis - Part III

In this post, we will reproduce the results of a popular paper on PCA. The paper is titled ‘Principal component analysis’ and is authored by Herve Abdi and Lynne J. Williams. It got published in 2010 and since then its popularity has only grown. Its number of citations are more than 4800 as per Google Scholar data (This was the number when this post was last revised).

This post is Part-III of a three part series on PCA. Other parts of the series can be found at the links below.

This post contains code snippets in R. Equivalent MATLAB codes can be written using commands of Part-II. For figures, the reader has to write his/her own code in MATLAB.

### Structure of the paper

Along with basic theory, the paper contains three examples on PCA, one example on correspondence analysis, and one example on multiple factor analysis. We will only focus on PCA examples in this post.

To run following R codes seamlessly, readers have to load following packages. If these packages have not been installed previously, use install.packages("package_name") to install those.

library(ggplot2)
library(ggrepel)

### How to get data

Data for the examples have been taken from the paper [1]. The datasets are pretty small. So one way to read the data is to create a dataframe itself in R using the values given in paper. Otherwise, the values can first be stored in a csv file and then read into R. To make this post self-sufficient, we will adopt the former approach.

# Table 1
# Create a dataframe
Words = c("Bag", "Across", "On", "Insane", "By", "Monastery", "Relief", "Slope", "Scoundrel", "With", "Neither", "Pretentious", "Solid", "This", "For", "Therefore", "Generality", "Arise", "Blot", "Infectious")
Word_length = c(3, 6, 2, 6, 2, 9, 6, 5, 9, 4, 7, 11, 5, 4, 3, 9, 10, 5, 4, 10)
Lines_in_dict = c(14, 7, 11, 9, 9, 4, 8, 11, 5, 8, 2, 4, 12, 9, 8, 1, 4, 13, 15, 6)
words = data.frame(Words, Word_length, Lines_in_dict, stringsAsFactors = F)
words
         Words Word_length Lines_in_dict
1          Bag           3            14
2       Across           6             7
3           On           2            11
4       Insane           6             9
5           By           2             9
6    Monastery           9             4
7       Relief           6             8
8        Slope           5            11
9    Scoundrel           9             5
10        With           4             8
11     Neither           7             2
12 Pretentious          11             4
13       Solid           5            12
14        This           4             9
15         For           3             8
16   Therefore           9             1
17  Generality          10             4
18       Arise           5            13
19        Blot           4            15
20  Infectious          10             6
(words_centered = scale(words[,2:3],scale = F)) # Centering after reemoving the first column
      Word_length Lines_in_dict
[1,]          -3             6
[2,]           0            -1
[3,]          -4             3
[4,]           0             1
[5,]          -4             1
[6,]           3            -4
[7,]           0             0
[8,]          -1             3
[9,]           3            -3
[10,]          -2             0
[11,]           1            -6
[12,]           5            -4
[13,]          -1             4
[14,]          -2             1
[15,]          -3             0
[16,]           3            -7
[17,]           4            -4
[18,]          -1             5
[19,]          -2             7
[20,]           4            -2
attr(,"scaled:center")
Word_length Lines_in_dict
6             8 

### Covariance PCA

Covariance PCA uses centered data matrix. But data matrix is not scaled. prcomp() centers data by default.

pca_words_cov = prcomp(words[,2:3],scale = F) # cov stands for Covariance PCA
factor_scores_words = pca_words_cov$x round(factor_scores_words,2)  PC1 PC2 [1,] -6.67 0.69 [2,] 0.84 -0.54 [3,] -4.68 -1.76 [4,] -0.84 0.54 [5,] -2.99 -2.84 [6,] 4.99 0.38 [7,] 0.00 0.00 [8,] -3.07 0.77 [9,] 4.14 0.92 [10,] -1.07 -1.69 [11,] 5.60 -2.38 [12,] 6.06 2.07 [13,] -3.91 1.30 [14,] -1.92 -1.15 [15,] -1.61 -2.53 [16,] 7.52 -1.23 [17,] 5.52 1.23 [18,] -4.76 1.84 [19,] -6.98 2.07 [20,] 3.83 2.30 Observer that factor scores for PC1 are negatives of what has been given in the paper. This is not a problem as principal directions are orthogonal. ### Principal directions are orthogonal # It can also be checked that both the principal components are orthogonal. sum(factor_scores_words[,1]*factor_scores_words[,2]) # PCs are orthogonal [1] -4.773959e-15 ### Contribution of each factor It is defined as square of factor score divided by sum of squares of factor scores in that column. round(factor_scores_words[,1]^2/sum(factor_scores_words[,1]^2)*100,2)  [1] 11.36 0.18 5.58 0.18 2.28 6.34 0.00 2.40 4.38 0.29 8.00 9.37 [13] 3.90 0.94 0.66 14.41 7.78 5.77 12.43 3.75 round(factor_scores_words[,2]^2/sum(factor_scores_words[,2]^2)*100,2)  [1] 0.92 0.55 5.98 0.55 15.49 0.28 0.00 1.13 1.63 5.48 10.87 8.25 [13] 3.27 2.55 12.32 2.90 2.90 6.52 8.25 10.18 The calculations in above two lines can be done in a single line round(factor_scores_words^2/matrix(rep(colSums(factor_scores_words^2),nrow(words)),ncol = 2,byrow = T)*100,2)  PC1 PC2 [1,] 11.36 0.92 [2,] 0.18 0.55 [3,] 5.58 5.98 [4,] 0.18 0.55 [5,] 2.28 15.49 [6,] 6.34 0.28 [7,] 0.00 0.00 [8,] 2.40 1.13 [9,] 4.38 1.63 [10,] 0.29 5.48 [11,] 8.00 10.87 [12,] 9.37 8.25 [13,] 3.90 3.27 [14,] 0.94 2.55 [15,] 0.66 12.32 [16,] 14.41 2.90 [17,] 7.78 2.90 [18,] 5.77 6.52 [19,] 12.43 8.25 [20,] 3.75 10.18 ### Squared distance to center of gravity (dist = rowSums(factor_scores_words^2))  [1] 45 1 25 1 17 25 0 10 18 4 37 41 17 5 9 58 32 26 53 20 ### Squared cosine of observations (sq_cos = round(factor_scores_words^2/rowSums(factor_scores_words^2)*100))  PC1 PC2 [1,] 99 1 [2,] 71 29 [3,] 88 12 [4,] 71 29 [5,] 53 47 [6,] 99 1 [7,] NaN NaN [8,] 94 6 [9,] 95 5 [10,] 29 71 [11,] 85 15 [12,] 90 10 [13,] 90 10 [14,] 74 26 [15,] 29 71 [16,] 97 3 [17,] 95 5 [18,] 87 13 [19,] 92 8 [20,] 74 26 Nan’s are produced because of division by zero. # Figue 1 p = ggplot(words,aes(x = Lines_in_dict,y = Word_length,label = Words))+ geom_point()+ geom_text_repel()+ geom_hline(yintercept = 6)+geom_vline(xintercept = 8)+ labs(x = "Lines in dictionary",y = "Word length") print(p) # Show directions of PCs # Note that intercept argument in geom_abline considers the line to be at the origin. In our case the data are mean shifted. # So we have to adjust the intercept taking new origin into consideration. These adjustments have been made below. slope1 = pca_words_cov$rotation[1,1]/pca_words_cov$rotation[2,1] # Slope of first PC slope2 = pca_words_cov$rotation[1,2]/pca_words_cov$rotation[2,2] # Slope of second PC (new_origin = c(mean(words$Lines_in_dict),mean(words$Word_length))) [1] 8 6 intercept1 = 6 - slope1*8 intercept2 = 6 - slope2*8 p+geom_abline(slope = slope1,intercept = intercept1,linetype = "dashed",size = 1.2,col = "red")+ geom_abline(slope = slope2,intercept = intercept2,linetype = "dashed",size = 1.2,col = "blue") In the above figure red dashed line is the 1st principal component (PC) and blue dashed line is the 2nd PC. ### Rotated PCs This figure is obtained by plotting factor scores. Note that we will plot negative of the factor scores of 1st PC to make the figure consistent with the paper. ggplot(as.data.frame(pca_words_cov$x),aes(-pca_words_cov$x[,1],pca_words_cov$x[,2],label = words$Words))+ geom_point()+geom_text_repel()+geom_hline(yintercept = 0)+geom_vline(xintercept = 0)+ xlab("Factor score along PC1")+ylab("Factor score along PC2") ### With supplementary data Given a supplementary point (a point previously not used in finding principal components),we have to first center the data point. Its factor scores can then be obtained by multiplying it with the loading matrix. ### Factor score of the new word ‘sur’ sur = c(3,12) # It has 3 letter and 12 lines of dictionary entry (sur_centered = sur - colMeans(words[,2:3]))  Word_length Lines_in_dict -3 4  (factor_scores_sur = round(sur_centered %*% pca_words_cov$rotation,2))
       PC1   PC2
[1,] -4.99 -0.38

### Eigenvalues and variance

See Part-II for details.

### Total variance before transformation

(total_var_before = round(sum(diag(var(words_centered))),3))
[1] 23.368

(total_var_after = round(sum(diag(var(pca_words_cov$x))),3)) [1] 23.368 Correlation between principal components and original variables (In the paper,this correlation is also termed loading. But we will strictly reserve the loading term to mean loading matrix $$\textbf{P}$$ (see Part-I) The sum of correlation coefficients between variables and principal components is 1. Intuitively, this means that variables are orthogonally projected onto the principal components. ### Correlation matrix # Correlation between PCs and original variables (cor(pca_words_cov$x,words_centered))
    Word_length Lines_in_dict
PC1   0.8679026    -0.9741764
PC2   0.4967344     0.2257884

Note that the answers for correlation coefficients don’t match with that of the paper. Readers who get actual answers as given in paper are encouraged to send me an email using my contact details. However our procedure is correct and it does indeed give the correct answer for supplementary data as described below.

### Squared correlation

(cor(pca_words_cov$x,words_centered)^2)  Word_length Lines_in_dict PC1 0.7532549 0.94901961 PC2 0.2467451 0.05098039 Sum of correlation coefficients between variables and principal components is 1. colSums((cor(pca_words_cov$x,words_centered)^2))
  Word_length Lines_in_dict
1             1 

(loading_matrix = pca_words_cov$rotation)  PC1 PC2 Word_length 0.5368755 0.8436615 Lines_in_dict -0.8436615 0.5368755 ### Correlation score for supplementary variables # Supplementary variable (Table 4) Frequency = c(8,230,700,1,500,1,9,2,1,700,7,1,4,500,900,3,1,10,1,1) Num_entries = c(6,3,12,2,7,1,1,6,1,5,2,1,5,9,7,1,1,4,4,2) supp_data = data.frame(Frequency,Num_entries) # Supplementary data supp_data  Frequency Num_entries 1 8 6 2 230 3 3 700 12 4 1 2 5 500 7 6 1 1 7 9 1 8 2 6 9 1 1 10 700 5 11 7 2 12 1 1 13 4 5 14 500 9 15 900 7 16 3 1 17 1 1 18 10 4 19 1 4 20 1 2 ### Centered supplementary data supp_data_cent = scale(supp_data,scale = F) # Centered supplementary data ### Correlation score for supplementary data (corr_score_supp = round(cor(pca_words_cov$x,supp_data),4))
    Frequency Num_entries
PC1   -0.3012     -0.6999
PC2   -0.7218     -0.4493

Note that correlation score doesn’t depend on whether supplementary data is centered or not.

(round(cor(pca_words_cov$x,supp_data_cent),4))  Frequency Num_entries PC1 -0.3012 -0.6999 PC2 -0.7218 -0.4493 ### Squared correlation (round(cor(pca_words_cov$x,supp_data_cent)^2,4))
    Frequency Num_entries
PC1    0.0907      0.4899
PC2    0.5210      0.2019

(round(colSums(cor(pca_words_cov$x,supp_data_cent)^2),4))  Frequency Num_entries 0.6118 0.6918  ### Correlation circle plot # First plot correlation circle x = seq(0,2*pi,length.out = 300) circle = ggplot() + geom_path(data = data.frame(a = cos(x),b = sin(x)), aes(cos(x),sin(x)),alpha = 0.3, size = 1.5)+ geom_hline(yintercept = 0)+geom_vline(xintercept = 0)+ annotate("text",x = c(1.08,0.05),y = c(0.05,1.08),label = c("PC1","PC2"),angle = c(0,90))+ xlab(NULL)+ylab(NULL) # Plotting original variables cor_score = as.data.frame(cor(words_centered,pca_words_cov$x))
variable_plot_original = circle + geom_point(data = cor_score,  aes(cor_score[,1],cor_score[,2]))+
geom_text_repel(aes(cor_score[,1],cor_score[,2],
label = c("Length of words","Number of lines in Dict.")))
print(variable_plot_original)

### Plotting supplementary variables

variable_plot_original+
geom_point(data = as.data.frame(corr_score_supp),
aes(corr_score_supp[,1],corr_score_supp[,2]))+
geom_text_repel(aes(corr_score_supp[,1],corr_score_supp[,2],
label = c("Frequency","Number of entries"))) 

Observe that our correlation circle plot is flipped about y-axis (i.e., PC2) when compared to the plot given in paper. This is because our first principal component is negative of the one given in paper. So while computing correlation score, this negative principal component results in negative correlation scores. Hence, our plot flips about y-axis.

## Example 2 (Wine example)

### Correlation PCA with wine data

# Table 6
wine_type = c(paste("wine", 1:5, sep = "_"))
hedonic = c(14, 10, 8, 2, 6)
for_meat = c(7, 7, 5, 4, 2)
for_dessert = c(8, 6, 5, 7, 4)
price = c(7, 4, 10, 16, 13)
sugar = c(7, 3, 5, 7, 3)
alcohol = c(13, 14, 12, 11, 10)
acidity = c(7, 7, 5, 3, 3)
wine = data.frame(wine_type, hedonic, for_meat, for_dessert, price, sugar, alcohol, acidity, stringsAsFactors = F)
wine
  wine_type hedonic for_meat for_dessert price sugar alcohol acidity
1    wine_1      14        7           8     7     7      13       7
2    wine_2      10        7           6     4     3      14       7
3    wine_3       8        5           5    10     5      12       5
4    wine_4       2        4           7    16     7      11       3
5    wine_5       6        2           4    13     3      10       3
pca_wine_cor = prcomp(wine[2:8],scale = T)
ggplot(as.data.frame(pca_wine_cor$x),aes(x = pca_wine_cor$x[,1],y =  pca_wine_cor$x[,2], label = paste0("wine ",1:5)))+ geom_point()+geom_text_repel()+ geom_vline(xintercept = 0)+ geom_hline(yintercept = 0)+ xlab("Factor score along PC1")+ylab("Factor score along PC2") Again our figure seems upside down than that of the paper. This is a minor discrepancy. Our 2nd eigenvector is negative of the one considered in paper. We can match the plot with that of the paper by just flipping the second principal component but we will not do that here. ### Factor scores along 1st and 2nd PC # Table 7 (pca_wine_cor$x[,1:2])
            PC1       PC2
[1,] -2.3301649  1.095284
[2,] -2.0842419 -1.223185
[3,]  0.1673228 -0.370258
[4,]  1.7842392  1.712563
[5,]  2.4628448 -1.214405

### Contribution of each observation to principal component

round(pca_wine_cor$x[,1:2]^2/matrix(rep(colSums(pca_wine_cor$x[,1:2]^2),nrow(wine)),ncol = 2,byrow = T)*100,2)
       PC1   PC2
[1,] 28.50 16.57
[2,] 22.80 20.66
[3,]  0.15  1.89
[4,] 16.71 40.51
[5,] 31.84 20.37

### Squared cosine of observations of first PC

(sq_cos = round(pca_wine_cor$x[,1:2]^2/rowSums(pca_wine_cor$x^2)*100))
     PC1 PC2
[1,]  77  17
[2,]  69  24
[3,]   7  34
[4,]  50  46
[5,]  78  19

(round(pca_wine_cor$rotation[,1:2],2))  PC1 PC2 hedonic -0.40 -0.11 for_meat -0.45 0.11 for_dessert -0.26 0.59 price 0.42 0.31 sugar -0.05 0.72 alcohol -0.44 -0.06 acidity -0.45 -0.09 ### Correlation score variables with first two principal components (corr_score_wine = round(cor(pca_wine_cor$x,wine[,2:8])[1:2,],2))
    hedonic for_meat for_dessert price sugar alcohol acidity
PC1   -0.87    -0.97       -0.58  0.91 -0.11   -0.96   -0.99
PC2   -0.15     0.15        0.79  0.42  0.97   -0.07   -0.12

### Correlation circle for wine data

# Figure 6
corr_score_wine = t(corr_score_wine)
circle +
geom_point(data = as.data.frame(corr_score_wine),
aes(corr_score_wine[,1],corr_score_wine[,2]))+
geom_text_repel(aes(corr_score_wine[,1],corr_score_wine[,2],
label = c("Hedonic","For Meat","For dessert","Price","Sugar","Alcohol","Acidity")))

### Varimax rotation

Rotation is applied to loading matrix such that after rotation principal components are interpretable. By interpretable, we mean, some of the loading scores will have higher values and some other loading scores will have lower values. So it can be said that the variables whose loading scores have higher value, contribute significantly towards principal components as compared to other variables with lesser loading scores. Though rotation works in certain cases, it must be remembered that it is no magic wand for principal component interpretability. One of the popular rotations is Varimax rotation. R has a built-in command to perform varimax rotation.

Varimax rotation can be performed on the whole loading matrix or on a few components only. In the paper, varimax has been applied to first two principal components.

(round(pca_wine_cor$rotation[,1:2],2))  PC1 PC2 hedonic -0.40 -0.11 for_meat -0.45 0.11 for_dessert -0.26 0.59 price 0.42 0.31 sugar -0.05 0.72 alcohol -0.44 -0.06 acidity -0.45 -0.09 ### Varimax applied to first two principal components rotated_loading_scores = varimax(pca_wine_cor$rotation[,1:2])

# Table 10
(round(rotated_loading_scores$loadings[,1:2],2))  PC1 PC2 hedonic -0.41 -0.02 for_meat -0.41 0.21 for_dessert -0.12 0.63 price 0.48 0.21 sugar 0.12 0.72 alcohol -0.44 0.05 acidity -0.46 0.02 The same result can also be obtained by multiplying the original loading matrix by the rotation matrix obtained from varimax. (round(pca_wine_cor$rotation[,1:2] %*% rotated_loading_scores$rotmat,2))  [,1] [,2] hedonic -0.41 -0.02 for_meat -0.41 0.21 for_dessert -0.12 0.63 price 0.48 0.21 sugar 0.12 0.72 alcohol -0.44 0.05 acidity -0.46 0.02 ### Plot of loading scores before rotation #Figure 7 ggplot(as.data.frame(pca_wine_cor$rotation[,1:2]),aes(x = pca_wine_cor$rotation[,1],y = pca_wine_cor$rotation[,2],
label = c("Hedonic","For Meat","For dessert","Price","Sugar","Alcohol","Acidity")))+
geom_point()+geom_text_repel()+geom_hline(yintercept = 0)+geom_vline(xintercept = 0)+
xlab("Loading score along PC1")+ylab("Loading score along PC2")

ggplot(as.data.frame(rotated_loading_scores$loadings[,1:2]), aes(x = rotated_loading_scores$loadings[,1],
y = rotated_loading_scores$loadings[,2], label = c("Hedonic","For Meat","For dessert","Price","Sugar","Alcohol","Acidity")))+ geom_point()+geom_text_repel()+geom_hline(yintercept = 0)+geom_vline(xintercept = 0)+ xlab("Loading score along PC1 after rotation")+ ylab("Loading score along PC2 after rotation") ## Example 3 ### French food example (Covariance PCA example) # Table 11 class = rep(c("Blue_collar", "White_collar", "Upper_class"), times = 4) children = rep(c(2,3,4,5), each = 3) bread = c(332, 293, 372, 406, 386, 438, 534, 460, 385, 655, 584, 515) vegetables = c(428, 559, 767, 563, 608, 843, 660, 699, 789, 776, 995, 1097) fruit = c(354, 388, 562, 341, 396, 689, 367, 484, 621, 423, 548, 887) meat = c(1437, 1527, 1948, 1507, 1501, 2345, 1620, 1856, 2366, 1848, 2056, 2630) poultry = c(526, 567, 927, 544, 558, 1148, 638, 762, 1149, 759, 893, 1167) milk = c(247, 239, 235, 324, 319, 243, 414, 400, 304, 495, 518, 561) wine = c(427, 258, 433, 407, 363, 341, 407, 416, 282, 486, 319, 284) food = data.frame(class, children, bread, vegetables, fruit, meat, poultry, milk, wine, stringsAsFactors = F) food  class children bread vegetables fruit meat poultry milk wine 1 Blue_collar 2 332 428 354 1437 526 247 427 2 White_collar 2 293 559 388 1527 567 239 258 3 Upper_class 2 372 767 562 1948 927 235 433 4 Blue_collar 3 406 563 341 1507 544 324 407 5 White_collar 3 386 608 396 1501 558 319 363 6 Upper_class 3 438 843 689 2345 1148 243 341 7 Blue_collar 4 534 660 367 1620 638 414 407 8 White_collar 4 460 699 484 1856 762 400 416 9 Upper_class 4 385 789 621 2366 1149 304 282 10 Blue_collar 5 655 776 423 1848 759 495 486 11 White_collar 5 584 995 548 2056 893 518 319 12 Upper_class 5 515 1097 887 2630 1167 561 284 pca_food_cov = prcomp(food[,3:9],scale = F) ### Factor scores # Table 12 (factor_scores_food = round(pca_food_cov$x[,1:2],2))
          PC1     PC2
[1,] -635.05  120.89
[2,] -488.56  142.33
[3,]  112.03  139.75
[4,] -520.01  -12.05
[5,] -485.94   -1.17
[6,]  588.17  188.44
[7,] -333.95 -144.54
[8,]  -57.51  -42.86
[9,]  571.32  206.76
[10,]  -39.38 -264.47
[11,]  296.04 -235.92
[12,]  992.83  -97.15

### Contribution of each observation to principal component

round(pca_food_cov$x[,1:2]^2/matrix(rep(colSums(pca_food_cov$x[,1:2]^2),nrow(food)),ncol = 2,byrow = T)*100,2)
        PC1   PC2
[1,] 13.34  5.03
[2,]  7.90  6.97
[3,]  0.42  6.72
[4,]  8.94  0.05
[5,]  7.81  0.00
[6,] 11.44 12.22
[7,]  3.69  7.19
[8,]  0.11  0.63
[9,] 10.80 14.71
[10,]  0.05 24.07
[11,]  2.90 19.15
[12,] 32.61  3.25
dist = pca_food_cov$x[,1]^2+pca_food_cov$x[,2]^2

### Squared cosine of observations

(sq_cos = round(pca_food_cov$x[,1:2]^2/rowSums(pca_food_cov$x^2)*100))
      PC1 PC2
[1,]  95   3
[2,]  86   7
[3,]  26  40
[4,] 100   0
[5,]  98   0
[6,]  89   9
[7,]  83  15
[8,]  40  22
[9,]  86  11
[10,]   2  79
[11,]  57  36
[12,]  97   1

# Table 13
(round(pca_food_cov$rotation[,1:2]^2,2))  PC1 PC2 bread 0.01 0.33 vegetables 0.11 0.17 fruit 0.09 0.01 meat 0.57 0.01 poultry 0.22 0.06 milk 0.01 0.40 wine 0.00 0.02 Note that this table doesn’t match with that of the paper. We will stick to our analysis. ### Correlation score (corr_score_food = round((cor(pca_food_cov$x,food[,3:9])[1:2,]),2))
    bread vegetables fruit meat poultry  milk  wine
PC1  0.36       0.91  0.96 1.00    0.98  0.41 -0.43
PC2 -0.87      -0.35  0.10 0.04    0.16 -0.88 -0.33

(round((cor(pca_food_cov$x,food[,3:9])[1:2,])^2,2))  bread vegetables fruit meat poultry milk wine PC1 0.13 0.83 0.92 1 0.96 0.17 0.18 PC2 0.76 0.12 0.01 0 0.03 0.77 0.11 ### Correlation circle for food data # Figure 9 corr_score_food = t(corr_score_food) circle + geom_point(data = as.data.frame(corr_score_food), aes(x = corr_score_food[,1],y = corr_score_food[,2]))+ geom_text_repel(data = as.data.frame(corr_score_food), aes(x = corr_score_food[,1],y = corr_score_food[,2], label = c("Bread","Vegetables","Fruit","Meat","Poultry","Milk","Wine"))) Now observe that our correlation circle plot is almost close to that of the papers (though in opposite quadrants. But this is not a problem as we have previously mentioned). ### Eigenvalues Eigenvalues of data covariance matrix is square of singular values of centered data matrix. Hence eigenvalues of data covariance matrix can be obtained as below. ## Table 14 cent_food = food[,3:9]-matrix(rep(colMeans(food[,3:9]),times = 12),nrow = 12, byrow = T) svd_food = svd(cent_food) (Eigenvalues = (svd_food$d)^2)
[1] 3023141.2354  290575.8390   68795.2333   25298.9496   22992.2474
[6]    3722.3214     723.9238

Important Note: These eigenvalues are not the same as variance of factor scores in principal components. Variance of principal component factor scores can be obtained by dividing the eigenvalues by $$(n-1)$$, where $$n$$ is number of data points (in this case $$n = 12$$). If this point is still not clear, refer to Part-II.

### Percentage contribution of each principal component

(round(Eigenvalues/sum(Eigenvalues),2))
[1] 0.88 0.08 0.02 0.01 0.01 0.00 0.00

### Cumulative sum of eigenvalues

(round(cumsum(Eigenvalues),2))
[1] 3023141 3313717 3382512 3407811 3430804 3434526 3435250

### Cumulative percentage contribution

(round(cumsum(Eigenvalues)/sum(Eigenvalues),2))
[1] 0.88 0.96 0.98 0.99 1.00 1.00 1.00

### RESS (Refer to the paper for a description)

RESS = array(rep(0,7))
for (i in 1:7){
RESS[i] = sum(Eigenvalues)-sum(Eigenvalues[1:i])
}
RESS
[1] 412108.5146 121532.6756  52737.4423  27438.4927   4446.2453    723.9238
[7]      0.0000

### Ratio of RESS and sum of eigenvalues

round(RESS/sum(Eigenvalues),2)
[1] 0.12 0.04 0.02 0.01 0.00 0.00 0.00

We will not calculate the value of PRESS in this post as it requires us to consider random models. We will not pursue that here.

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggrepel_0.9.1 ggplot2_3.3.5

loaded via a namespace (and not attached):
[1] Rcpp_1.0.7       highr_0.9        bslib_0.3.1      compiler_4.1.2
[5] pillar_1.6.3     jquerylib_0.1.4  tools_4.1.2      digest_0.6.28
[9] jsonlite_1.7.2   evaluate_0.14    lifecycle_1.0.1  tibble_3.1.2
[13] gtable_0.3.0     pkgconfig_2.0.3  rlang_0.4.11     DBI_1.1.1
[17] yaml_2.2.1       blogdown_1.7     xfun_0.29        fastmap_1.1.0
[21] withr_2.4.2      stringr_1.4.0    dplyr_1.0.6      knitr_1.36
[25] generics_0.1.0   sass_0.4.0       vctrs_0.3.8      tidyselect_1.1.1
[29] grid_4.1.2       glue_1.4.2       R6_2.5.1         fansi_0.4.2
[33] rmarkdown_2.8    bookdown_0.22    farver_2.1.0     purrr_0.3.4
[37] magrittr_2.0.1   scales_1.1.1     htmltools_0.5.2  ellipsis_0.3.2
[41] assertthat_0.2.1 colorspace_2.0-1 labeling_0.4.2   utf8_1.2.1
[45] stringi_1.6.2    munsell_0.5.0    crayon_1.4.1    

Though unusually long, I hope, this post will be of help to (courageous) readers who work there way through it till end. Comments regarding any errors or omissions may be sent to author’s email.

## Reference

1. Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), 433-459.

Last updated: 19th January, 2020

##### Biswajit Sahoo
###### ML Engineer

My research interests include machine learning, deep learning, signal processing and data-driven machinery condition monitoring.