Question: Get top-contribution variant name for PCA result
7 weeks ago by
luongthang1908 (50) wrote:

Hi everyone,

I used the factoextra package to make a biplot showing only the 20 variables with the highest contribution. The code is:

fviz_pca_biplot(pca_result1, title = "", col.var = "steelblue", select.var = list(contrib = 20))

The result looks very nice. However, I need the names of those 20 variables as a list that I can use later. Do you know any way to get them, other than writing down the names of the top 20 variables from the biplot by hand?

Thank you,

R • 239 views
modified 13 days ago • written 7 weeks ago by luongthang1908 (50)

Perhaps save the plot object and then run str() to see if these top 20 are stored in any specific part of the plot object:

myplot <- fviz_pca_biplot(pca_result1, title = "", col.var = "steelblue", select.var = list(contrib = 20))
str(myplot)

...or, just check the code of the fviz_pca_biplot function to see how it defines these top 20, and then run the code yourself.

written 7 weeks ago by Kevin Blighe (71k)

Thank you so much for your help. I will look at the code of the fviz_pca_biplot function to see how it does the calculation. Thanks.

written 6 weeks ago by luongthang1908 (50)

Hello, guillermo.luque.ds

modified 13 days ago • written 13 days ago by luongthang1908 (50)
1

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer, if they all work.

modified 13 days ago • written 13 days ago by GenoMax (96k)

Thanks, I did upvote and accept the answer.

written 11 days ago by luongthang1908 (50)
3
25 days ago by
Germany
guillermo.luque.ds (40) wrote:

Hi @luongthang1908, following what is stated in the factoextra package documentation, you can get the top N contributing variables to both PC1 and PC2 by doing this:

pca_result1 <- FactoMineR::PCA(X) # X is a dataframe/matrix with n rows (samples) and p cols (numeric values)
contrib_PC12 <- pca_result1$var$contrib[, 1:2] # get contributions of each variable to PC1 and PC2
eig_PC12 <- pca_result1$eig[1:2, 1] # get the eigenvalues of both components
contrib_total <- apply(contrib_PC12, 1, function(x) sum(x * eig_PC12) / sum(eig_PC12)) # calculate the total contribution
N <- 20
topN_PC12 <- names(sort(contrib_total, decreasing = TRUE)[1:N]) # names of the N variables with the highest total contribution

Then topN_PC12 contains the variables you are interested in.

modified 12 days ago • written 25 days ago by guillermo.luque.ds (40)

Hello, @guillermo.luque.ds

I have just tried the approach you suggested. However, I get an error:

Error in apply(contrib_PC12, 1, function(x) { : dim(X) must have a positive length
written 13 days ago by luongthang1908 (50)
1

Hi @luongthang1908, in my example, I've assumed (perhaps wrongly) pca_result1 was the output of the PCA command from the FactoMineR library. I have updated the code so hopefully, this solves the problem for you.

written 12 days ago by guillermo.luque.ds (40)

Hi @guillermo.luque.ds, thank you so much for your help. The FactoMineR::PCA(X) function works perfectly.

Does this mean that FactoMineR::PCA is different from the regular base R PCA function?

written 11 days ago by luongthang1908 (50)

Hi @luongthang1908, I will assume the regular function is prcomp. At the core, both functions use singular value decomposition to perform a principal component analysis of a given matrix (e.g. a gene counts table). However, the way they output the results differs. Regarding the PCA function from FactoMineR, this link has some additional information that may be useful for your analyses.
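
To illustrate the difference in output: a prcomp object has no $var element, which would explain the apply() error above if pca_result1 came from prcomp. Below is a minimal sketch of the same top-N calculation using only what prcomp returns (rotation and sdev). It assumes the built-in mtcars data as a stand-in for your table, and N = 5 is an arbitrary cut-off. Note that FactoMineR::PCA scales variables to unit variance by default while prcomp does not, so scale. = TRUE is set here to make the two comparable.

```r
# mtcars is a stand-in dataset; replace it with your own numeric table
pca <- prcomp(mtcars, scale. = TRUE)

# Variable coordinates: loadings multiplied by the component standard deviations
coord <- sweep(pca$rotation, 2, pca$sdev, `*`)
cos2 <- coord^2

# Contribution (%) of each variable to each component; each column sums to 100
contrib <- sweep(cos2, 2, colSums(cos2), `/`) * 100

# Eigenvalues are the squared standard deviations
eig <- pca$sdev^2

# Total contribution to PC1 and PC2, weighted by the eigenvalues
contrib_total <- as.vector(contrib[, 1:2] %*% eig[1:2]) / sum(eig[1:2])
names(contrib_total) <- rownames(contrib)

N <- 5 # arbitrary cut-off for this sketch
topN <- names(sort(contrib_total, decreasing = TRUE))[seq_len(N)]
topN
```

The resulting character vector can be reused later, for example to subset the columns of the original table.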

written 10 days ago by guillermo.luque.ds (40)
1
29 days ago by
China
MatthewP (880) wrote:

Add lines and arrows from (0, 0) to the coordinates in pca$rotation on your PCA biplot. Higher absolute values in the pca$rotation matrix mean a higher contribution, so longer arrows mean a higher contribution.

> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
> pca <- prcomp(mtcars)
> head(pca$rotation[,1:2])
              PC1          PC2
mpg  -0.038118199  0.009184847
cyl   0.012035150 -0.003372487
disp  0.899568146  0.435372320
hp    0.434784387 -0.899307303
drat -0.002660077 -0.003900205
wt    0.006239405  0.004861023
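
Instead of reading the arrow lengths off the plot, the same ranking can be computed directly. This is a minimal sketch, assuming the unscaled prcomp(mtcars) fit from above; N = 2 is an arbitrary cut-off. Because the data are not scaled here, variables with large variance (disp and hp) dominate the loadings.

```r
# Rank variables by arrow length in the PC1/PC2 plane
pca <- prcomp(mtcars)                      # unscaled, as in the output above
loadings <- pca$rotation[, 1:2]
arrow_length <- sqrt(rowSums(loadings^2))  # distance of each arrow tip from (0, 0)

N <- 2 # arbitrary cut-off for this sketch
top_vars <- names(sort(arrow_length, decreasing = TRUE))[seq_len(N)]
top_vars
```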
written 29 days ago by MatthewP (880)