PCA and statistics on SNPmatrix
5.9 years ago
dominicdhall ▴ 40

I have a VCF file containing genotype data for a few thousand SNPs across a few thousand samples. I would like to first convert this to a matrix (possibly using the VariantAnnotation package), then run a PCA on the samples, followed by some sort of clustering algorithm. I have very little experience with SNP matrix packages, PCA, or clustering algorithms, so I was wondering if anyone knows of any good tutorials that might help.

It is also worth noting that, due to the nature of the analysis I am running, the SNP matrix will be extremely sparse. I would therefore also like to get the fraction of missing genotypes for each sample and the fraction of samples with a missing genotype for each SNP. Is this possible?
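To make the two missingness measures concrete, here is a minimal Python sketch. It assumes the genotypes have already been pulled out of the VCF into a plain list of rows (one row per SNP, one GT string per sample) and that missing calls are written as `./.`, `.|.` or `.`. The function name `missing_fractions` and the toy data are made up for illustration; this is not the VariantAnnotation API.

```python
# GT strings that count as "no call" (an assumption about how the VCF encodes them).
MISSING = {"./.", ".|.", "."}

def missing_fractions(genotypes):
    """Return (per_snp, per_sample) fractions of missing calls.

    genotypes: list of rows, one row per SNP, each row a list of GT strings
    with one entry per sample (all rows the same length).
    """
    if not genotypes or not genotypes[0]:
        return [], []
    n_snps = len(genotypes)
    n_samples = len(genotypes[0])
    # Fraction of samples with a missing call, for each SNP (row-wise).
    per_snp = [sum(gt in MISSING for gt in row) / n_samples for row in genotypes]
    # Fraction of SNPs with a missing call, for each sample (column-wise).
    per_sample = [
        sum(row[j] in MISSING for row in genotypes) / n_snps
        for j in range(n_samples)
    ]
    return per_snp, per_sample

# Toy matrix: 4 SNPs x 3 samples.
toy = [
    ["0/0", "0/1", "./."],
    ["./.", "1/1", "./."],
    ["0/1", "0/0", "0/0"],
    ["0/0", "./.", "./."],
]
per_snp, per_sample = missing_fractions(toy)
```

With these two vectors in hand, SNPs or samples above some missingness cutoff can be dropped before the PCA; what cutoff is sensible depends on the data.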

PCA SNP • 1.8k views
5.9 years ago
leeandroid ▴ 130

To run the PCA you can use SNPRelate or TASSEL, which has a user-friendly interface.
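If it helps to see what a PCA on the samples does under the hood, here is a rough Python sketch using NumPy. It is not how SNPRelate or TASSEL implement it. It assumes the genotypes are coded as alternate-allele counts (0, 1, 2) in a SNPs x samples array with `np.nan` for missing calls, that every SNP has at least one non-missing call, and it fills missing values with the per-SNP mean, which is only one of several possible choices for sparse data. The function name `pca_samples` and the toy data are hypothetical.

```python
import numpy as np

def pca_samples(dosage, n_components=2):
    """PCA of samples from a SNPs x samples dosage matrix (np.nan = missing).

    Returns (scores, explained): scores is samples x n_components,
    explained is the fraction of variance carried by each component.
    """
    X = np.asarray(dosage, dtype=float)
    # Per-SNP mean over the non-missing calls.
    snp_means = np.nanmean(X, axis=1, keepdims=True)
    # Mean-impute missing calls, then centre each SNP.
    X = np.where(np.isnan(X), snp_means, X) - snp_means
    # Samples are columns, so the right singular vectors describe the samples.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = (Vt[:n_components] * s[:n_components, None]).T
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Toy data: 4 SNPs x 6 samples, two groups of three samples each.
toy = [
    [0, 0, 0, 2, 2, 2],
    [0, 0, np.nan, 2, 2, 2],
    [2, 2, 2, 0, 0, np.nan],
    [1, 0, 1, 1, 0, 1],
]
scores, explained = pca_samples(toy)
```

The rows of `scores` can then be passed to whatever clustering method is chosen. With very sparse matrices the imputation step can dominate the result, so it is worth filtering on missingness first.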
