Question: Examining Principal Components of Genetic Ancestry in R
tesfadej2003 wrote (2.4 years ago):

Hi All,

I want to examine population structure using genotype data. Using the smartpca program from the EIGENSOFT package, I calculated principal components with the command 'smartpca -p mydata > pca.log'. The resulting file 'pca.log' contains a lot of output, including outlier removal (5 iterations), Tracy-Widom statistics, the top 10 eigenvalues, 10 eigenvectors, etc.

Now, I have no idea how to plot the population structure in R. Does anyone have an R script to plot the first two principal components of genetic ancestry?

Thank you for your time!

Tags: pca, R
Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote (2.4 years ago):

I think this page has what you're looking for.


First of all, thank you very much for the helpful hint. I looked at the link and tried to adapt it, but it is not working. I understood that all the output from 'smartpca -p Itamba.par > Itamba.out' should be in the file 'Itamba.out'. However, 'Itamba_PCA.evec' and 'Itamba_PCA.eval' were not created as separate output files, and I couldn't find their contents inside 'Itamba.out'. Is there a way to create these two files from the main output file ('Itamba.out')? The script uses them as separate files.

Thank you in advance.

Reply written 2.4 years ago by tesfadej2003

Not all the output is in one file. The output file names are specified in a configuration file. In this example, the configuration file is called Itamba.par, and in it the file 'Itamba_PCA.evec' (containing the PCs/eigenvectors) should be given as the value of the parameter 'evecoutname', i.e. as
evecoutname: Itamba_PCA.evec
The same goes for the eigenvalues and the parameter 'evaloutname'. These parameters should be required, so check that you have specified the correct file name/path and that you don't get any error message.
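Once 'evecoutname' and 'evaloutname' are set, the two files can be read into R and plotted. The following is only a minimal sketch, assuming the usual smartpca layout: the .evec file starts with a '#eigvals:' header line, followed by one row per sample holding the sample ID, the PC values, and a population label in the last column, and the .eval file holds one eigenvalue per line. The function names read_evec and plot_pc12 are made up for this illustration; the file names are the ones used in this thread.

```r
# Read a smartpca .evec file (assumed layout: "#eigvals:" header line,
# then one row per sample: sample ID, PC columns, population label).
read_evec <- function(path) {
  # The header line starts with '#', so read.table skips it as a comment.
  evec <- read.table(path, header = FALSE, stringsAsFactors = FALSE)
  n_pc <- ncol(evec) - 2
  names(evec) <- c("sample", paste0("PC", seq_len(n_pc)), "population")
  evec
}

# Scatter plot of PC1 vs PC2, one colour per population label.
# eigvals is optional; if given, axis labels show each PC's share
# of the sum of the supplied eigenvalues.
plot_pc12 <- function(evec, eigvals = NULL) {
  pop <- factor(evec$population)
  xlab <- "PC1"
  ylab <- "PC2"
  if (!is.null(eigvals)) {
    pct <- round(100 * eigvals / sum(eigvals), 1)
    xlab <- paste0("PC1 (", pct[1], "%)")
    ylab <- paste0("PC2 (", pct[2], "%)")
  }
  plot(evec$PC1, evec$PC2, col = as.integer(pop), pch = 19,
       xlab = xlab, ylab = ylab, main = "smartpca: PC1 vs PC2")
  legend("topright", legend = levels(pop),
         col = seq_along(levels(pop)), pch = 19, bty = "n")
  invisible(evec)
}

# Usage with the file names from this thread (runs only if they exist).
if (file.exists("Itamba_PCA.evec")) {
  evec <- read_evec("Itamba_PCA.evec")
  eigvals <- if (file.exists("Itamba_PCA.eval")) {
    scan("Itamba_PCA.eval", quiet = TRUE)
  } else {
    NULL
  }
  plot_pc12(evec, eigvals)
}
```

If the population column holds a single placeholder label, all points get the same colour; in that case a separate sample-to-population table would have to be merged in first.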

Reply written 2.4 years ago by Jean-Karim Heriche

Now, it is fine. I am grateful. I owe you.

Reply written 2.4 years ago by tesfadej2003
anp375 wrote (2.4 years ago):

Samples   Eigenvector1  Eigenvector2  Eigenvector3  ...  EigenvectorX
Sample1   Xcoord        Ycoord        Zcoord
Sample2   Xcoord        Ycoord        Zcoord
...
SampleN   Xcoord        Ycoord        Zcoord

Plot 1 point for each sample using two of the three coordinates. Or plot all three at the same time using this package: https://cran.r-project.org/web/packages/pca3d/pca3d.pdf
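For a 2D plot, two columns of that table are enough. Here is a minimal sketch in base R, assuming the table has a header row as in the layout above. The small data frame is made-up illustration data, and plot_two_pcs is a hypothetical helper name.

```r
# Made-up coordinates in the layout described above (illustration only).
pcs <- data.frame(
  Samples      = c("Sample1", "Sample2", "Sample3", "Sample4"),
  Eigenvector1 = c(0.12, 0.10, -0.25, -0.22),
  Eigenvector2 = c(-0.05, 0.08, 0.15, -0.11),
  Eigenvector3 = c(0.02, -0.03, 0.07, 0.01)
)

# Plot one point per sample using any two eigenvector columns.
plot_two_pcs <- function(pcs, x = "Eigenvector1", y = "Eigenvector2") {
  stopifnot(all(c(x, y) %in% names(pcs)))
  plot(pcs[[x]], pcs[[y]], pch = 19, xlab = x, ylab = y)
  # Label each point with its sample name, just above the point.
  text(pcs[[x]], pcs[[y]], labels = pcs$Samples, pos = 3, cex = 0.8)
  # Return the plotted columns invisibly for further use.
  invisible(pcs[, c("Samples", x, y)])
}

xy <- plot_two_pcs(pcs)                                  # PC1 vs PC2
xz <- plot_two_pcs(pcs, "Eigenvector1", "Eigenvector3")  # PC1 vs PC3
```

With a real file in this layout, something like read.table("yourfile.txt", header = TRUE) would replace the made-up data frame; the file name here is a placeholder.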


Thank you. You gave me a hint for plotting in 3D, but what I need is a 2D plot.

Reply written 2.4 years ago by tesfadej2003