Question: PCA and statistics on SNPmatrix
8 months ago by
dominicdhall40 wrote:

I have a VCF file containing genotype data for a few thousand SNPs across a few thousand samples. I would like to first convert this to a matrix (possibly using the VariantAnnotation package), then perform a PCA on the samples, followed by some sort of clustering algorithm. I have very little experience with any of the SNP matrix packages, PCA or clustering algorithms, so I was wondering if anyone knew of any good tutorials that might help me.

It is also worth noting that due to the nature of the analysis I am running, the SNP matrix will be extremely sparse. I would therefore also like to get information on the fraction of missing genotypes for each sample and the fraction of missing samples for each SNP - is this possible?
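
Both fractions are simple counts once the genotype calls are laid out as a SNPs-by-samples table. The sketch below is only an illustration in plain Python, not the VariantAnnotation workflow: the function names `parse_genotypes` and `missing_fractions` and the toy `demo_vcf` lines are made up for this example, and it assumes every record carries a `GT` key in its FORMAT column and that any call containing `.` counts as missing.

```python
def parse_genotypes(vcf_lines):
    """Return (samples, snp_ids, calls) from VCF text lines.

    calls[i][j] is the number of non-reference alleles for SNP i in
    sample j, or None when the genotype is missing (e.g. "./." or ".").
    """
    samples, snp_ids, calls = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        gt_idx = fields[8].split(":").index("GT")
        row = []
        for entry in fields[9:]:
            parts = entry.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles:
                row.append(None)  # treat partially missing calls as missing
            else:
                row.append(sum(a != "0" for a in alleles))
        snp_ids.append(f"{fields[0]}:{fields[1]}")
        calls.append(row)
    return samples, snp_ids, calls


def missing_fractions(calls):
    """Return (per_sample, per_snp) fractions of missing genotype calls."""
    if not calls or not calls[0]:
        return [], []
    n_snps, n_samples = len(calls), len(calls[0])
    per_snp = [sum(c is None for c in row) / n_samples for row in calls]
    per_sample = [
        sum(row[j] is None for row in calls) / n_snps for j in range(n_samples)
    ]
    return per_sample, per_snp


# Toy input invented for this example: 3 SNPs, 4 samples.
demo_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t./.\t0/1\t1/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t./.\t./.\t0/0\t0/1",
    "2\t300\t.\tG\tA\t.\tPASS\t.\tGT:DP\t1/1:8\t.\t0|1:5\t./.:0",
]

samples, snp_ids, calls = parse_genotypes(demo_vcf)
per_sample, per_snp = missing_fractions(calls)
for name, frac in zip(samples, per_sample):
    print(f"{name}: {frac:.2f} of SNPs missing")
for snp, frac in zip(snp_ids, per_snp):
    print(f"{snp}: {frac:.2f} of samples missing")
```

For a real file with thousands of samples, a dedicated VCF reader would be a better choice than splitting lines by hand; the point here is only that the two statistics are row-wise and column-wise counts of missing calls.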

8 months ago by
leeandroid80 wrote:

For the PCA, you can use SNPRelate, or TASSEL, which has a user-friendly interface.
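
To give a rough idea of what a sample-level PCA on a sparse genotype matrix involves, here is a minimal NumPy sketch. It is not how SNPRelate or TASSEL implement PCA, and the function name `genotype_pca` and the toy `dosage` matrix are invented for illustration. It assumes genotypes are coded as 0/1/2 alternate-allele counts with `np.nan` for missing calls, and it fills each missing call with that SNP's observed mean before centering, which is one simple choice among several.

```python
import numpy as np


def genotype_pca(dosage, n_components=2):
    """PCA of a samples-by-SNPs dosage matrix with NaN for missing calls.

    Returns (scores, explained): per-sample coordinates on the leading
    components and the fraction of total variance each one explains.
    """
    X = np.asarray(dosage, dtype=float)
    observed = ~np.isnan(X)

    # Drop SNPs with no observed calls; they carry no information.
    n_obs = observed.sum(axis=0)
    keep = n_obs > 0
    X, observed, n_obs = X[:, keep], observed[:, keep], n_obs[keep]

    # Per-SNP mean over observed calls only.
    means = np.where(observed, X, 0.0).sum(axis=0) / n_obs

    # Mean-impute missing calls, then center each SNP.
    X = np.where(observed, X, means) - means

    # SVD of the centered matrix: rows of U scaled by S are the PC scores.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2)[:n_components] / (S ** 2).sum()
    return scores, explained


# Toy data invented for this example: two groups of three samples, six SNPs.
dosage = np.array([
    [0, 0, 0, 2, 2, np.nan],
    [0, 0, 1, 2, 2, 2],
    [0, np.nan, 0, 2, 1, 2],
    [2, 2, 2, 0, 0, 0],
    [2, 2, np.nan, 0, 0, 0],
    [2, 1, 2, 0, np.nan, 0],
])

scores, explained = genotype_pca(dosage, n_components=2)
print("PC1 scores:", np.round(scores[:, 0], 2))
print("variance explained:", np.round(explained, 3))
```

With heavy missingness, mean imputation pulls poorly genotyped samples toward the origin, so filtering samples and SNPs by missing rate before the PCA is usually worth considering. The resulting `scores` array is what a clustering step would take as input.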
