PCA with PLINK and smartPCA: same input file but different results in my plot. Why?
11 months ago

Hi everyone, and thank you in advance for any kind of help!

I'm trying to perform a Principal Component Analysis both with PLINK and smartPCA. The aim is to use PCs as covariates: they are necessary for the GWAS I have to run after the Principal Component Analysis step.

For PLINK, I used a pruned binary fileset with 632 individuals and about 36,500 SNPs and ran the --pca command, which returned the eigenvector and eigenvalue files (.eigenvec and .eigenval). Then I plotted the eigenvector file with R.

For smartPCA, I converted my pruned binary files into PED and MAP format and then, with CONVERTF, into EIGENSTRAT format, adding my population labels in the last column of the .ind file. I ran smartPCA with -k 10 and -m 0, hoping to obtain the same result as in PLINK. Finally, I used R to build the plot.

Both plots are based on the same 632 individuals and 36,500 variants, but they don't correspond. Eigenvector values that are positive in the PLINK plot are negative in the smartPCA plot, so the clustering is mirrored along the y-axis (PC2).

Also, the sample labels (and population groups) for the same points don't match between the two plots.

My question is: why is there this difference?

Is there an error in how I plotted my results? If so, what could it be? Or does it depend on a different way of calculating the eigenvector files in PLINK and smartPCA? Do they perform Principal Component Analysis in different ways?

Thank you very much to anyone who tries to help. Fran

PCA PLINK smartPCA EIGENSTRAT plot • 649 views

Principal component sign is meaningless and effectively random; what matters is how the points are oriented relative to each other.
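To make two sets of PCs directly comparable in a plot, one option is to flip each component of one set so that it points the same way as the matching component of the other. The following is only a minimal sketch on made-up numbers: `align_signs` is a hypothetical helper, not part of PLINK or smartPCA, and it assumes both tables list the same samples in the same row order.

```python
def align_signs(reference, other):
    """Flip columns of `other` so each PC points the same way as in `reference`.

    Both arguments are lists of rows (one row per sample, one column per PC).
    Assumption: rows are in the same sample order in both tables.
    """
    n_pcs = len(reference[0])
    signs = []
    for j in range(n_pcs):
        # A negative dot product means this PC has the opposite orientation.
        dot = sum(r[j] * o[j] for r, o in zip(reference, other))
        signs.append(-1.0 if dot < 0 else 1.0)
    return [[value * sign for value, sign in zip(row, signs)] for row in other]


# Toy values only (not real PLINK or smartPCA output): PC2 is mirrored.
pcs_a = [[0.5, 0.2], [-0.1, 0.4], [-0.4, -0.6]]
pcs_b = [[0.5, -0.2], [-0.1, -0.4], [-0.4, 0.6]]

aligned = align_signs(pcs_a, pcs_b)
print(aligned == pcs_a)
```

After this step the two plots should overlay if the only difference was the sign. For use as GWAS covariates the flip makes no difference, since it only changes the sign of the fitted covariate coefficient.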


And what about the labels for some individuals/points?

For example, the point that in the PLINK plot corresponds to sample TD701 (Norwegian population) corresponds in the smartPCA plot to sample 14798 (Italian population). Thank you for your help.


That is not expected, but I can't tell you what went wrong there unless I have enough information to replicate your run.
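One check that does not require the full run is to join the two eigenvector tables by sample ID instead of by row position, in case the rows were reordered or the IDs were rewritten during conversion. The sketch below makes several assumptions: the PLINK file has the layout `FID IID PC1 PC2 ...`, the smartPCA file has an `#eigvals:` first line followed by `ID PC1 ... population`, and smartPCA IDs may carry a `FID:` prefix. The two parser functions are hypothetical helpers, and the numeric values are toy data, not results from this dataset.

```python
def parse_plink_eigenvec(lines):
    """Return {IID: [PC values]} from lines laid out as 'FID IID PC1 PC2 ...'."""
    pcs = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and any header line.
        if not fields or fields[0].startswith("#") or fields[0] == "FID":
            continue
        pcs[fields[1]] = [float(x) for x in fields[2:]]
    return pcs


def parse_smartpca_evec(lines):
    """Return ({ID: [PC values]}, {ID: population}) from smartPCA-style lines."""
    pcs, labels = {}, {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and the '#eigvals:' line.
        if not fields or fields[0].startswith("#"):
            continue
        sample_id = fields[0].split(":")[-1]  # drop an assumed 'FID:' prefix
        pcs[sample_id] = [float(x) for x in fields[1:-1]]
        labels[sample_id] = fields[-1]
    return pcs, labels


# Toy lines only; note the two files list the samples in a different order.
plink_lines = ["FAM1 TD701 0.10 -0.20", "FAM2 14798 -0.30 0.40"]
smartpca_lines = [
    "#eigvals: 5.0 3.0",
    "FAM2:14798 -0.30 -0.40 Italian",
    "FAM1:TD701 0.10 0.20 Norwegian",
]

plink_pcs = parse_plink_eigenvec(plink_lines)
smart_pcs, labels = parse_smartpca_evec(smartpca_lines)

# Compare values sample by sample, matched on ID rather than on row number.
shared = sorted(set(plink_pcs) & set(smart_pcs))
for sample_id in shared:
    print(sample_id, labels[sample_id], plink_pcs[sample_id], smart_pcs[sample_id])
```

If the values agree (up to sign) once matched by ID, the mismatch in the plot would most likely come from attaching labels by row order in R rather than from the PCA itself.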

