Question: PCA and statistics on SNPmatrix
dominicdhall40 wrote (6 months ago):

I have a VCF file containing genotype data for a few thousand SNPs across a few thousand samples. I would like to first convert this to a matrix (possibly using the VariantAnnotation package), then run a PCA on the samples, followed by some sort of clustering algorithm. I have very little experience with SNP matrix packages, PCA or clustering algorithms, so I was wondering if anyone knows of any good tutorials that could help me.

It is also worth noting that due to the nature of the analysis I am running, the SNP matrix will be extremely sparse. I would therefore also like to get information on the fraction of missing genotypes for each sample and the fraction of missing samples for each SNP - is this possible?
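For illustration, the two missingness fractions can be computed straight from the VCF text. This is only a minimal sketch in plain Python, not the VariantAnnotation route mentioned above. The function name `missing_fractions` is hypothetical. It assumes an uncompressed, tab-delimited VCF with a GT subfield in FORMAT, and it treats any genotype containing `.` (including half-calls such as `./1`) as missing.

```python
def missing_fractions(vcf_lines):
    """Return (per_sample, per_snp) missing-genotype fractions.

    vcf_lines: iterable of text lines from an uncompressed VCF.
    Assumption: any GT containing '.' counts as missing.
    """
    samples = []
    missing_per_sample = []
    per_snp = {}
    n_snps = 0

    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            missing_per_sample = [0] * len(samples)
            continue

        # FORMAT (column 9) names the per-sample subfields; locate GT.
        gt_index = fields[8].split(":").index("GT")
        n_missing = 0
        for i, entry in enumerate(fields[9:]):
            parts = entry.split(":")
            gt = parts[gt_index] if gt_index < len(parts) else "."
            if "." in gt:
                missing_per_sample[i] += 1
                n_missing += 1

        # Key each SNP by CHROM:POS (the ID column is often just '.').
        per_snp[f"{fields[0]}:{fields[1]}"] = n_missing / len(samples)
        n_snps += 1

    per_sample = {
        name: (count / n_snps if n_snps else 0.0)
        for name, count in zip(samples, missing_per_sample)
    }
    return per_sample, per_snp
```

Calling it as `missing_fractions(open("my.vcf"))` (the file name is a placeholder) returns one dictionary keyed by sample name and one keyed by `CHROM:POS`. Either can then be used to filter samples or SNPs before PCA.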

snp pca • 282 views
leeandroid80 wrote (6 months ago):

To run PCA you can use SNPRelate, or TASSEL, which has a user-friendly interface.
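To give a rough idea of what a PCA on a samples-by-SNPs genotype matrix involves, here is a minimal NumPy sketch. It is not how SNPRelate or TASSEL implement it. The function name `genotype_pca` is hypothetical. The input is assumed to be alternate-allele counts (0/1/2) with `np.nan` for missing calls. Missing values are filled with the per-SNP mean, which is a simplification and may not suit a very sparse matrix.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA of a samples x SNPs matrix of allele counts (0/1/2, NaN = missing).

    Returns (scores, explained): sample coordinates on the leading
    components and the fraction of variance each one explains.
    """
    g = np.asarray(genotypes, dtype=float)

    # Drop SNPs with no observed genotype at all (their mean is undefined).
    g = g[:, (~np.isnan(g)).any(axis=0)]

    # Assumption: impute each missing call with that SNP's observed mean,
    # then centre every SNP column on the same mean.
    col_mean = np.nanmean(g, axis=0)
    filled = np.where(np.isnan(g), col_mean, g)
    centered = filled - col_mean

    # SVD of the centred matrix: the columns of u scaled by the singular
    # values are the principal component scores of the samples.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]

    total = np.sum(s ** 2)
    explained = (s ** 2 / total if total > 0 else np.zeros_like(s))[:n_components]
    return scores, explained
```

The rows of `scores` can be plotted directly or passed to a clustering method. Because mean imputation pulls samples with many missing calls toward the origin, it may be worth filtering on missingness first.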
