1000 Genomes Distribution: Rate of Synonymous and Nonsynonymous SNPs
12.1 years ago
Joel • 0

Hi all,

The 1000 Genomes paper reports about 10,000-11,000 non-synonymous sites and 10,000-12,000 synonymous sites per individual. I am trying to get the distribution of these counts so I can do some statistics. For example, I would like to know whether a particular region of interest is significantly enriched in nsSNPs compared with the genome-wide average. Is there a quick way to sample random segments of the exome and count sSNP and nsSNP rates?

Thanks, Joel

genome snp • 2.3k views
12.1 years ago
Laura ★ 1.8k

The LOF files in the pilot release do have position info in them.

You could use tabix to index them, then rapidly pull out pieces of those files and the equivalent sections of the VCF files to generate whatever stats you want.

This would all be in NCBI36 coordinates, though.
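Once you have the positions out of those files, the sampling step could look something like the sketch below. It is only a rough illustration, not anything shipped with the 1000 Genomes data: the function names (`count_in_window`, `sample_window_counts`, `empirical_p_value`) are made up for this example, and it assumes you have already extracted sorted synonymous and non-synonymous SNP positions for a single chromosome into plain Python lists. It draws random fixed-size windows, counts both classes in each, and compares an observed nsSNP count against that sampled distribution.

```python
import random
from bisect import bisect_left


def count_in_window(sorted_positions, start, end):
    """Count positions p with start <= p < end (list must be sorted)."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)


def sample_window_counts(syn, nonsyn, region_start, region_end,
                         window, n_samples, seed=0):
    """Draw random windows and return (synonymous, non-synonymous) counts.

    Assumption: syn and nonsyn are sorted position lists on one chromosome,
    and windows are drawn uniformly from [region_start, region_end).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        start = rng.randrange(region_start, region_end - window + 1)
        end = start + window
        counts.append((count_in_window(syn, start, end),
                       count_in_window(nonsyn, start, end)))
    return counts


def empirical_p_value(null_counts, observed_nonsyn):
    """Share of sampled windows with at least the observed nsSNP count.

    Adds 1 to numerator and denominator so the estimate is never exactly 0.
    """
    at_least = sum(1 for _, ns in null_counts if ns >= observed_nonsyn)
    return (at_least + 1) / (len(null_counts) + 1)
```

For an exome-only comparison you would want to draw the windows from coding intervals rather than from the whole chromosome, and the window size should match your region of interest.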
