Question: PCA and statistics on SNPmatrix
2.4 years ago by
dominicdhall40 wrote:

I have a VCF file containing genotype data for a few thousand SNPs across a few thousand samples. I would like to first convert this to a matrix (possibly using the VariantAnnotation package), then run a PCA on the samples, followed by some sort of clustering algorithm. I have very little experience with any of the SNP matrix packages, PCA, or clustering algorithms, so I was wondering if anyone knows of any good tutorials that might help.

It is also worth noting that due to the nature of the analysis I am running, the SNP matrix will be extremely sparse. I would therefore also like to get information on the fraction of missing genotypes for each sample and the fraction of missing samples for each SNP - is this possible?

snp pca • 1.1k views
2.4 years ago by
leeandroid90 wrote:

To run PCA you can use SNPRelate, or TASSEL, which has a user-friendly interface.
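On the missing-data part of the question: yes, both fractions are easy to get once the genotypes are in a matrix. As far as I know SNPRelate can report them itself (snpgdsSampMissRate for samples, snpgdsSNPRateFreq for SNPs), but to show what is actually being computed, here is a minimal sketch in plain Python. It is not tied to any of the packages above. The function names parse_genotypes and missing_fractions and the tiny example VCF are made up for illustration, and it assumes that GT is the first FORMAT field and that any call containing "." counts as missing.

```python
# Sketch only: per-sample and per-SNP missing-genotype fractions from VCF text.
# Assumptions (not from the thread): GT is the first FORMAT field, and a call
# such as "./." or ".|." (any allele equal to ".") is treated as missing.

def parse_genotypes(vcf_lines):
    """Return (samples, snp_ids, matrix).

    matrix[i][j] is the number of non-reference alleles for SNP i in
    sample j, or None when the genotype is missing.
    """
    samples, snp_ids, matrix = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            continue
        snp_ids.append(f"{fields[0]}:{fields[1]}")  # label SNP as CHROM:POS
        row = []
        for entry in fields[9:]:
            alleles = entry.split(":")[0].replace("|", "/").split("/")
            if "." in alleles:
                row.append(None)
            else:
                row.append(sum(1 for a in alleles if a != "0"))
        matrix.append(row)
    return samples, snp_ids, matrix


def missing_fractions(matrix):
    """Return (per_sample, per_snp) lists of missing-genotype fractions."""
    n_snps = len(matrix)
    n_samples = len(matrix[0]) if matrix else 0
    if n_snps == 0 or n_samples == 0:
        return [], []
    per_snp = [sum(g is None for g in row) / n_samples for row in matrix]
    per_sample = [
        sum(row[j] is None for row in matrix) / n_snps
        for j in range(n_samples)
    ]
    return per_sample, per_snp


# Hypothetical two-SNP, three-sample VCF used only to exercise the functions.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t./.\t0/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t./.\t./.\t1/1",
]
samples, snp_ids, matrix = parse_genotypes(example)
per_sample, per_snp = missing_fractions(matrix)
for name, frac in zip(samples, per_sample):
    print(name, frac)
```

With a very sparse matrix it is probably worth filtering on these fractions (dropping samples and SNPs above some missingness threshold of your choosing) before running the PCA, since most PCA implementations need to impute or ignore missing calls.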
