Question: PCA and statistics on SNPmatrix
12 days ago by
dominicdhall40
dominicdhall40 wrote:

I have a VCF file containing genotype data for a few thousand SNPs across a few thousand samples. I would like to first convert this to a matrix (possibly using the VariantAnnotation package), then perform a PCA on the samples, followed by some sort of clustering algorithm. I have very little experience with any of the SNP matrix packages, PCA or clustering algorithms, so I was wondering if anyone knows of any good tutorials that may be able to help me.

It is also worth noting that due to the nature of the analysis I am running, the SNP matrix will be extremely sparse. I would therefore also like to get information on the fraction of missing genotypes for each sample and the fraction of missing samples for each SNP - is this possible?
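
For what it's worth, both fractions come down to counting missing calls along the rows and columns of the genotype table. Here is a minimal sketch in plain Python, assuming an uncompressed, tab-separated VCF whose FORMAT column contains a GT field and where a missing call is written with `.` (for example `./.`). The toy VCF text and the function names `parse_genotypes` and `missing_fractions` are made up for illustration; this is not the VariantAnnotation API.

```python
# Toy VCF content, invented purely for illustration (3 SNPs x 3 samples).
VCF_TEXT = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/0\t./.\t0/1
1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t./.\t./.\t1/1
1\t300\trs3\tG\tA\t.\tPASS\t.\tGT\t0/1\t./.\t0/0
"""


def parse_genotypes(vcf_text):
    """Return (samples, snp_ids, rows); rows[i][j] is the non-reference
    allele count of sample j at SNP i, or None if the call is missing."""
    samples, snp_ids, rows = [], [], []
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            continue
        gt_index = fields[8].split(":").index("GT")
        row = []
        for call in fields[9:]:
            alleles = call.split(":")[gt_index].replace("|", "/").split("/")
            if "." in alleles:
                row.append(None)  # any missing allele counts as a missing call
            else:
                row.append(sum(1 for a in alleles if a != "0"))
        snp_ids.append(fields[2])
        rows.append(row)
    return samples, snp_ids, rows


def missing_fractions(samples, snp_ids, rows):
    """Fraction of missing genotypes per sample and per SNP."""
    per_snp = {
        snp: sum(g is None for g in row) / len(samples)
        for snp, row in zip(snp_ids, rows)
    }
    per_sample = {
        name: sum(row[j] is None for row in rows) / len(rows)
        for j, name in enumerate(samples)
    }
    return per_sample, per_snp


samples, snp_ids, rows = parse_genotypes(VCF_TEXT)
per_sample, per_snp = missing_fractions(samples, snp_ids, rows)
print(per_sample)
print(per_snp)
```

For a real file with thousands of samples, a dedicated VCF reader would probably be a better choice than hand-parsing. The sketch is only meant to show that the two summaries are simple row and column counts.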

Tags: snp, pca
12 days ago by
leeandroid80
leeandroid80 wrote:

To run a PCA you can use SNPRelate (an R/Bioconductor package) or TASSEL, which has a user-friendly graphical interface.
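
To give an idea of what a PCA on a genotype matrix involves, here is a rough, dependency-free sketch in Python. It is not how SNPRelate or TASSEL are implemented. It assumes genotypes are coded as 0/1/2 alternate-allele counts with `None` for missing calls, that every SNP has at least one observed call, and that the data have at least as many non-trivial components as requested. Missing calls are replaced by the SNP mean, which is a simplifying assumption, and the function names and toy data are hypothetical.

```python
import math


def center_and_impute(genotypes):
    """Centre each SNP (column) on its observed mean; a missing call (None)
    is set to the mean, i.e. to 0.0 after centring."""
    n_snps = len(genotypes[0])
    centered = [[0.0] * n_snps for _ in genotypes]
    for j in range(n_snps):
        observed = [row[j] for row in genotypes if row[j] is not None]
        mean = sum(observed) / len(observed)  # assumes >= 1 observed call
        for i, row in enumerate(genotypes):
            centered[i][j] = 0.0 if row[j] is None else row[j] - mean
    return centered


def leading_components(centered, n_components=2, n_iter=500):
    """Power iteration with deflation on the sample-by-sample Gram matrix.
    Returns (scores, explained): scores[c][i] is the score of sample i on
    component c, explained[c] the fraction of total variance it carries."""
    n = len(centered)
    gram = [
        [sum(a * b for a, b in zip(centered[i], centered[k])) for k in range(n)]
        for i in range(n)
    ]
    total_var = sum(gram[i][i] for i in range(n))
    scores, explained = [], []
    for _ in range(n_components):
        v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary deterministic start
        for _ in range(n_iter):
            w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        eigval = sum(
            v[i] * sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)
        )
        scores.append([math.sqrt(eigval) * x for x in v])
        explained.append(eigval / total_var)
        # Deflate: remove the component just found before the next round.
        gram = [
            [gram[i][k] - eigval * v[i] * v[k] for k in range(n)]
            for i in range(n)
        ]
    return scores, explained


# Toy data, invented for illustration: 4 samples x 4 SNPs, two missing calls.
toy = [
    [0, 0, 1, None],
    [0, 1, 0, 0],
    [2, 2, None, 2],
    [2, 1, 2, 2],
]
scores, explained = leading_components(center_and_impute(toy))
print("PC1 scores:", scores[0])
print("variance explained:", explained)
```

In this toy example the first two samples and the last two samples fall on opposite sides of the first component, which is the kind of separation a clustering step would then pick up. For a few thousand samples and SNPs, the dedicated tools above or a numerical library would be much faster than this pure-Python loop.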
