How can I obtain a list of SNPs generating clusters in a PCA?
5.5 years ago
Simo ▴ 50

Hi, I'm working with SNP array data. After running a PCA, I ended up with two clusters within the same population. The split clearly corresponds to two different genotyping plates, so I would like to find out which SNPs are driving it. Do you have any suggestions?

Also, have you run into situations like this before, and if so, how did you handle them?

Thanks a lot

PCA SNPs Clusters • 1.3k views
5.5 years ago

> loadings <- snpgdsPCASNPLoading(PCA, genofile)
Working space: 3893 samples, 3869 SNPs
Using 1 (CPU) core.
Using the top 2 eigenvectors.
SNP Loading:    the sum of all working genotypes (0, 1 and 2) = 256360
List of 8
 $ sample.id : chr [1:3893] "AF346967.1" "AF346968.1" "AF346969.1" "AF346976.1" ...
 $ snp.id    : chr [1:3869] "A10005G" "A10018G" "A10032G" "A10039G" ...
 $ eigenval  : num [1:3893] 45.1 42.3 NaN NaN NaN ...
 $ snploading: num [1:2, 1:3869] -0.008308 -0.000411 0.000923 0.001307 0.001924 ...
 $ TraceXTX  : num 29776992
 $ Bayesian  : logi FALSE
 $ avefreq   : num [1:3869] 0.001027 0.000257 0.000771 0.001027 0.001027 ...
 $ scale     : num [1:3869] 44.1 88.2 51 44.1 44.1 ...
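
One possible way to turn that object into a ranked SNP list is to pair each row of snploading (one row per eigenvector, one column per SNP, as in the output above) with snp.id and sort by absolute loading. The sketch below is base R only and uses a small mock list with made-up SNP names and loading values, so it runs without a GDS file. The helper name top_snps is hypothetical and not part of SNPRelate; with real data you would pass in the result of snpgdsPCASNPLoading() instead.

```r
# Mock object with the two fields we need, shaped like the str() output:
# snploading has one row per eigenvector and one column per SNP.
# SNP names and values here are invented for illustration only.
loadings <- list(
  snp.id = paste0("snp", 1:6),
  snploading = rbind(
    c(-0.80, 0.01,  0.05, 0.60, -0.02, 0.10),  # PC1
    c( 0.02, 0.70, -0.03, 0.01, -0.50, 0.04)   # PC2
  )
)

# Hypothetical helper: rank SNPs by absolute loading on a chosen PC.
top_snps <- function(loadings, pc = 1, n = 10) {
  tab <- data.frame(
    snp.id  = loadings$snp.id,
    loading = loadings$snploading[pc, ],
    stringsAsFactors = FALSE
  )
  # Largest absolute loading first: these SNPs contribute most to the PC.
  tab <- tab[order(abs(tab$loading), decreasing = TRUE), ]
  rownames(tab) <- NULL
  head(tab, n)
}

top_pc1 <- top_snps(loadings, pc = 1, n = 2)
top_pc2 <- top_snps(loadings, pc = 2, n = 2)
print(top_pc1)
print(top_pc2)
```

Whichever PC separates the two plates is the one to rank on. The sign of a loading only indicates direction along the axis, which is why the sort uses the absolute value.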

Thank you, really :)

I'm sorry, this is my first time doing this kind of analysis, and I'm still having trouble working out how to do this, even after searching extensively. How can I extract the loadings for the PC I want, together with the associated snp.id? And how is this different from the SNP correlation analysis in this tutorial? Any help would be welcome!