Question: Find particular SNPs affecting PCA
3.3 years ago by
lovestowell10
United States
lovestowell10 wrote:

Hey there,

I have a set of ~13K SNPs across 16 individuals, 8 from each of two different species. Individuals were sequenced in two different sequencing runs. I am using the R package SNPRelate to calculate relatedness between pairs and to visualize the individuals in PCA space. The problem is that PCA is separating not only by species but also by sequencing run. This pattern persists even if I remove SNPs that are missing in many individuals.

I would like to find the particular SNPs that are separating the different sequencing runs and remove them from analyses. Any suggestions on finding the SNPs that are contributing to this pattern?

-S

 

snprelate snp pca myposts • 870 views
3.3 years ago by
mkulecka300
European Union
mkulecka300 wrote:

Your PCA object should contain something like a matrix of variable loadings. If you extract it, you can see which variables are most strongly correlated with PC1, PC2, etc. For example, for an object of class prcomp, it can be done like this:

 # pcobj is a PCA object of class prcomp
 v <- as.data.frame(pcobj$rotation)  # variable loadings, one row per variable (SNP)
 v <- v[order(v$PC1), ]              # sort by loading on PC1
 v <- rbind(head(v), tail(v))        # keep both ends: a strong loading can be negative or positive
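
To make this concrete for the sequencing-run problem, here is a rough sketch on simulated data. Everything in it is made up for illustration: the genotype matrix, the SNP names, the planted run artifact in SNPs 1-20, and the cutoff of 20 suspects. It assumes the PC that separates runs can be picked out by correlating PC scores with run membership, and it ranks SNPs by absolute loading rather than looking only at the head and tail of a signed ordering. Note that a SNPRelate PCA result is not a prcomp object; I believe snpgdsPCASNPLoading() is the function that gives per-SNP loadings there, but check the package documentation.

```r
set.seed(1)

n_ind <- 16
n_snp <- 200
species <- rep(c("A", "B"), each = 8)
run     <- rep(c("run1", "run2"), times = 8)  # runs balanced within species

# Simulated genotypes coded 0/1/2 (hypothetical data, not real SNP calls)
geno <- matrix(rbinom(n_ind * n_snp, size = 2, prob = 0.3),
               nrow = n_ind,
               dimnames = list(paste0("ind", 1:n_ind),
                               paste0("snp", 1:n_snp)))

# Plant a run artifact in SNPs 1-20 and a species difference in SNPs 101-140
geno[, 1:20]    <- ifelse(run == "run2", 2, 0)
geno[, 101:140] <- ifelse(species == "B", 2, 0)

pca <- prcomp(geno, center = TRUE, scale. = FALSE)

# Which of the leading PCs tracks sequencing run most closely?
# (only the first 10 are checked; the last PC carries essentially no variance)
run_cor <- abs(cor(pca$x[, 1:10], as.numeric(run == "run2")))[, 1]
run_pc  <- names(which.max(run_cor))

# Rank SNPs by absolute loading on that PC and flag the top 20 as suspects
loadings <- pca$rotation[, run_pc]
suspects <- names(sort(abs(loadings), decreasing = TRUE))[1:20]

# Drop the suspects before re-running the PCA or relatedness estimates
geno_clean <- geno[, !(colnames(geno) %in% suspects)]
```

With real data there is no known answer to compare against, so the number of SNPs to drop is a judgment call. One option is to remove suspects in batches and re-run the PCA until the run separation disappears while the species separation stays.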