Question: Calculate the number of non-synonymous and synonymous SNP sites
6.8 years ago by
United States
upendrakumar.devisetty390 wrote:

Hi, I have a VCF file containing the SNPs between two genotypes of interest, and I have annotated the SNPs using the snpEff annotation tool. snpEff does a good job of annotating the SNPs and counting the number of non-synonymous and synonymous SNPs. One thing I am interested in is calculating the dN/dS ratio for each chromosome for the two genotypes. I did that, but someone recently told me that before calculating the dN/dS ratio, I should first estimate the number of non-synonymous and synonymous sites. So I am wondering: how does one go about estimating non-synonymous and synonymous sites from a VCF file?

Thanks

Upendra

6.8 years ago by
David W4.8k
New Zealand
David W4.8k wrote:

The d_N/d_S ratio is the ratio of the non-synonymous and synonymous substitution _rates_ in a region (not the raw counts of changes). So your colleague is right to point out that you need to know the number of sites that could generate each type of change. As it turns out, turning the counts into rates is not as straightforward as you might think. You can read something like "Hurst (2002). The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends in Genetics, 18:486-487" to work out which method to use and implement it.

BUT are you sure d_N/d_S is going to tell you anything? The statistic was developed to understand protein evolution between divergent species, and it doesn't tell us very much about protein evolution within populations.
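
To make the counts-versus-rates distinction concrete, here is a minimal Python sketch of one common approach, in the style of the Nei-Gojobori counting method. It is an illustration under stated assumptions, not the method from the paper cited above: it uses the standard genetic code, treats changes to stop codons as non-synonymous (implementations differ on this), skips stop codons and codons with ambiguous bases, and applies a Jukes-Cantor correction. The function names are hypothetical. Plugging in per-SNP counts such as those reported by snpEff also assumes at most one SNP per codon.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites (between 0 and 3) in one codon."""
    aa = CODON_TABLE[codon]
    s = 0.0
    for i in range(3):
        for b in BASES:
            if b == codon[i]:
                continue
            mutant = codon[:i] + b + codon[i + 1:]
            # Assumption: a change to a stop codon counts as non-synonymous.
            if CODON_TABLE[mutant] == aa:
                s += 1.0 / 3.0
    return s


def count_sites(cds):
    """Return (S, N): synonymous and non-synonymous sites in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Skip stop codons and codons containing ambiguous bases such as N.
    codons = [c for c in codons if CODON_TABLE.get(c, "*") != "*"]
    S = sum(synonymous_sites(c) for c in codons)
    N = 3 * len(codons) - S
    return S, N


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits (needs p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(cds, n_syn, n_nonsyn):
    """dN/dS from a coding sequence and observed difference counts."""
    S, N = count_sites(cds)
    dS = jukes_cantor(n_syn / S)
    dN = jukes_cantor(n_nonsyn / N)
    if dS == 0:
        return math.nan  # ratio is undefined without synonymous differences
    return dN / dS
```

For example, `count_sites("ATGGGTTTA")` sums the per-codon values for ATG, GGT and TTA. Because most possible changes are non-synonymous, N is typically much larger than S, which is why a raw ratio of non-synonymous to synonymous SNP counts overstates dN/dS.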


Thanks David. I will look into the paper. When I talked to some of my colleagues here, they suggested that I could construct a genome for each genotype and, once this is done, use existing software to align the sequences and estimate the Ka/Ks ratio. I agree that the dN/dS statistic was developed to understand protein evolution between divergent species, but I would like to know how this statistic varies between the two genotypes that I am interested in, as these genotypes are the parents of a mapping population.
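
The "construct a sequence for each genotype" step can be sketched roughly as follows. This is a toy illustration, not a replacement for a dedicated tool: `apply_snps` is a hypothetical helper, the SNP tuples are assumed to have been parsed from the VCF already, and positions are assumed to be 1-based coordinates relative to the sequence passed in (as in the VCF POS column). Indels and multi-base records are simply skipped.

```python
def apply_snps(ref_seq, snps):
    """Return a copy of ref_seq with single-base SNP alleles substituted.

    snps is an iterable of (pos, ref, alt) tuples with 1-based positions.
    """
    seq = list(ref_seq.upper())
    for pos, ref, alt in snps:
        if len(ref) != 1 or len(alt) != 1:
            continue  # skip indels and multi-base records in this sketch
        if seq[pos - 1] != ref.upper():
            # The VCF and the reference sequence disagree, so stop early.
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos - 1] = alt.upper()
    return "".join(seq)


# Toy example: one SNP changes the second codon from GGT to GGC.
reference_cds = "ATGGGTTTA"
genotype_cds = apply_snps(reference_cds, [(6, "T", "C")])
```

The resulting pair of coding sequences is already aligned codon by codon (no indels were applied), so it can be passed to whichever Ka/Ks estimation software is chosen.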

Reply written 6.8 years ago by upendrakumar.devisetty, modified 14 months ago by Ram