Question: Find particular SNPs affecting PCA
lovestowell10 (United States) wrote 4.7 years ago:

Hey there,

I have a set of ~13K SNPs across 16 individuals, 8 from each of two different species. Individuals were sequenced in two different sequencing runs. I am using the R package SNPRelate to calculate relatedness between pairs and to visualize the individuals in PCA space. The problem is that PCA is separating not only by species but also by sequencing run. This pattern persists even if I remove SNPs that are missing in many individuals.

I would like to find the particular SNPs that are separating the different sequencing runs and remove them from analyses. Any suggestions on finding the SNPs that are contributing to this pattern?

-S

 

snprelate snp pca myposts • 1.2k views
modified 5 months ago by rturba070 • written 4.7 years ago by lovestowell10

I'm sorry, this is my first time doing this kind of analysis, and I don't understand what the structure of your PCA object looks like. I'm using the SNPRelate package to do my PCAs, and the loading object looks like this:

List of 8
 $ sample.id : chr [1:3893] "AF346967.1" "AF346968.1" "AF346969.1" "AF346976.1" ...
 $ snp.id    : chr [1:3869] "A10005G" "A10018G" "A10032G" "A10039G" ...
 $ eigenval  : num [1:3893] 45.1 42.3 NaN NaN NaN ...
 $ snploading: num [1:2, 1:3869] -0.008308 -0.000411 0.000923 0.001307 0.001924 ...
 $ TraceXTX  : num 29776992
 $ Bayesian  : logi FALSE
 $ avefreq   : num [1:3869] 0.001027 0.000257 0.000771 0.001027 0.001027 ...
 $ scale     : num [1:3869] 44.1 88.2 51 44.1 44.1 ...
 - attr(*, "class")= chr "snpgdsPCASNPLoadingClass"

How can I extract the loadings for the PC that I want, together with the associated snp.id? And how is this different from the SNP correlation analysis from this tutorial? Any help is welcome!

modified 5 months ago • written 5 months ago by rturba070

mkulecka320 (European Union) wrote 4.7 years ago:

Your PCA object should contain something like a matrix of variable loadings. If you extract it, you can see which variables are strongly correlated with PC1, PC2, etc. For example, for an object of class prcomp, it can be done like this:

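# pcobj is a PCA object of class prcomp
v <- as.data.frame(pcobj$rotation)   # variable loadings, one column per PC
v <- v[order(v$PC1), ]               # sort variables by their PC1 loading
# Loadings can be strongly negative or positive, so keep both ends
v <- rbind(head(v), tail(v))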
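
The same idea might carry over to the SNPRelate loading object shown in the comment above, where snploading appears to be a matrix with one row per PC and one column per SNP, in the same order as snp.id. The following is only a rough sketch under that assumption: load_obj is a mock list with made-up values standing in for the real object (which, judging by its class name, would come from snpgdsPCASNPLoading()), and top_loading_snps is a hypothetical helper, not part of SNPRelate.

```r
# Mock object mimicking the str() output in the comment above.
# Assumption: snploading has one row per PC and one column per SNP.
set.seed(1)
n_snp <- 20
load_obj <- list(
  snp.id     = paste0("snp", seq_len(n_snp)),      # hypothetical SNP IDs
  snploading = matrix(rnorm(2 * n_snp), nrow = 2)  # 2 PCs x 20 SNPs, made-up values
)

# Hypothetical helper: rank SNPs by absolute loading on one PC
# and return the n strongest, keeping the sign of each loading.
top_loading_snps <- function(obj, pc = 1, n = 5) {
  loadings <- obj$snploading[pc, ]
  n <- min(n, length(loadings))
  ord <- order(abs(loadings), decreasing = TRUE)[seq_len(n)]
  data.frame(snp.id = obj$snp.id[ord],
             loading = loadings[ord],
             stringsAsFactors = FALSE)
}

# SNPs contributing most to PC2 (e.g. if PC2 is the axis separating the runs)
top_pc2 <- top_loading_snps(load_obj, pc = 2, n = 5)
print(top_pc2)
```

Sorting by absolute value puts the strongly negative and strongly positive loadings in one list, which does the same job as the head()/tail() step above. How many of the top SNPs to drop is a judgment call, and it is worth re-running the PCA afterwards to check whether the sequencing-run separation actually goes away.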
modified 9 months ago by RamRS30k • written 4.7 years ago by mkulecka320