Question: Principal Component Analysis (PCA) for GWAS (SNP genotype data)
5.8 years ago by
United States
I'm doing a Genome Wide Association Study (GWAS) in R. I have SNP genotype data for 300 individuals. I have a total of 177,000 SNPs.

Before diving into the GWAS, I want to adjust for population stratification with PCA. I need to identify the top principal components (PCs), say the top 10, and use them as covariates in the association analysis. Do you know of any R packages or other software that will let me do this seamlessly?

I know about EIGENSTRAT, which is implemented as part of the EIGENSOFT software. Has anyone used it before? I downloaded the software last night. Is there a tutorial on how to load/import the 177,000 SNPs into the software and run the PCA? Do the SNPs need to be in a special format? Once I have the results, is it possible to import them into R? Do I need to write a wrapper to call the software? Any tips or information would be helpful. Thanks.


gwas snp pca R • 6.2k views
modified 8 days ago by ricardoguerreiro212160 • written 5.8 years ago by samorjoy10
5.8 years ago by
Doha, Qatar
I used SNPRelate for one such analysis a few months back. You could give that a try.
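
In case it helps, here is a minimal sketch of a SNPRelate PCA. It assumes the gdsfmt and SNPRelate packages are installed from Bioconductor, and it runs on the small example GDS file that ships with SNPRelate rather than on your data. The PLINK file names in the commented-out conversion step are hypothetical placeholders, and the object names (pca, pcs) are just for illustration.

```r
library(gdsfmt)
library(SNPRelate)

# For your own data you would first convert to GDS, e.g. from PLINK binary
# files (hypothetical file names):
# snpgdsBED2GDS("mydata.bed", "mydata.fam", "mydata.bim", "mydata.gds")
# genofile <- snpgdsOpen("mydata.gds")

# Here: open the example GDS file bundled with SNPRelate
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Run the PCA on the genotype data
pca <- snpgdsPCA(genofile, num.thread = 2)

# Keep the top 10 eigenvectors as a covariate table, one row per sample
top <- pca$eigenvect[, 1:10]
colnames(top) <- paste0("PC", 1:10)
pcs <- data.frame(sample.id = pca$sample.id, top)

# Percent of variance explained by each of the top 10 PCs
pc_percent <- pca$varprop[1:10] * 100

snpgdsClose(genofile)
```

The pcs data frame can then be merged with your phenotype table by sample.id and the PC1 to PC10 columns added as covariates in the association model.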
written 5.8 years ago by always_learning1.0k

Good suggestion, thanks. I'll take a look at the manual for it and get back to you if I get stuck.

written 5.8 years ago by samorjoy10
8 days ago by
Did you find the answer? I think it's as simple as passing n.PC = 10 in your sommer::GWAS call. It does everything automatically.
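
If you would rather see the steps spelled out, here is a rough base R sketch of the general idea: compute PCs from the genotype matrix, then include the top 10 as covariates in a per-SNP regression. Everything in it is an assumption for illustration: the genotypes and phenotype are simulated, the variable names are made up, and it uses plain prcomp scaling, which is not the exact normalization EIGENSTRAT applies and is not necessarily what sommer does internally.

```r
set.seed(1)

# Simulated data (illustrative only): 300 individuals, 500 SNPs coded 0/1/2
n_ind <- 300
n_snp <- 500
maf <- runif(n_snp, 0.05, 0.5)
geno <- sapply(maf, function(p) rbinom(n_ind, 2, p))  # n_ind x n_snp matrix

# Drop monomorphic SNPs, which cannot be scaled
geno <- geno[, apply(geno, 2, var) > 0]

# PCA on centered, scaled genotypes; keep the top 10 PC scores
pca <- prcomp(geno, center = TRUE, scale. = TRUE)
pcs <- pca$x[, 1:10]

# Simulated quantitative phenotype (illustrative only)
pheno <- rnorm(n_ind)

# Per-SNP linear regression, adjusting for the 10 PCs
p_values <- apply(geno, 2, function(g) {
  fit <- lm(pheno ~ g + pcs)
  summary(fit)$coefficients["g", "Pr(>|t|)"]
})
```

With real data you would replace geno and pheno with your own genotype matrix and trait. For 177,000 SNPs a dedicated package will be much faster than a loop over lm.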

Cheers, Ricardo

written 8 days ago by ricardoguerreiro212160
