2.9 years ago
Tam
Hi everyone,
I'm trying to run PCA on a VCF dataset with about 200 samples and 800k SNPs using the SNPRelate package. However, the variance explained by the first component seems very low (around 0.5%) compared with the example in the tutorial (which, as I understand it, uses far fewer SNPs). Is this to be expected? Has anyone done this kind of analysis before?
Thank you
Thanks Kevin. The samples are indeed from the same geographical region. By homogeneous, I assume you mean, e.g., having similar ethnicities, right?
Yes, homogeneous in that sense, i.e., a non-diverse ethnic pool. There may also be other reasons related to any filtering that you have performed, e.g., on minor allele frequency or linkage disequilibrium.
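As a rough sanity check on the 0.5% figure, here is a small back-of-the-envelope sketch in Python. It is not part of SNPRelate. The function name `null_pc1_share` is made up for illustration. It assumes standardized genotypes and independent SNPs (which linkage disequilibrium violates), so treat the result as an order-of-magnitude baseline only. With no population structure, the variance is spread roughly evenly over the n - 1 components left after centering, and sampling noise inflates the top one slightly.

```python
import math


def null_pc1_share(n_samples, n_snps):
    """Approximate PC1 variance share when there is no population structure.

    Hypothetical helper for illustration; assumes independent, standardized SNPs.
    """
    # After centering, n_samples - 1 components carry variance; with no
    # structure each one explains roughly the same share.
    even_share = 1.0 / (n_samples - 1)
    # Marchenko-Pastur upper edge: sampling noise inflates the largest
    # eigenvalue by about (1 + sqrt(n / p))^2 relative to the average.
    inflation = (1.0 + math.sqrt(n_samples / n_snps)) ** 2
    return even_share, even_share * inflation


even, top = null_pc1_share(200, 800_000)
print(f"even share per PC: {even:.2%}, approximate null PC1 share: {top:.2%}")
```

For 200 samples both values come out at about 0.5%, so a first component explaining around 0.5% is close to what a sample with little or no structure would give under these assumptions.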