Calculating the number of synonymous and non-synonymous sites for genes
7.1 years ago
chlamystar • 0

Hi all,

I have a list of the numbers of synonymous and non-synonymous SNPs for a large set of genes (please note: I do not have the actual SNP sequences, just the counts).

I would now like to calculate the Ka/Ks ratio. To do so, I need the number of synonymous and non-synonymous sites in each gene. I can find a lot of information on how to calculate Ka/Ks, but not on how to work out how many sites could host a synonymous or a non-synonymous substitution.

Any advice on how to do this? Please bear in mind that I am not great at programming: I can run R but do not currently have access to things like MATLAB.

Thanks in advance for any feedback. I am a first-time user, so I am happy to give more info. Daisy

SNP synonymous kaks • 3.0k views
7.1 years ago

Unless you have reliable gene sequences and need very precise values, there is a workaround. Assume that the ratio of synonymous to nonsynonymous sites in a genomic region of, say, 1000 nucleotides is constant. Since Ka/Ks = (nonsynonymous SNPs / nonsynonymous sites) / (synonymous SNPs / synonymous sites), Ka/Ks is then approximately your nonsynonymous/synonymous SNP ratio multiplied by that constant. To a first approximation the constant is the same for all genes, so you can rank genes by their nonsynonymous/synonymous SNP ratio as if it were Ka/Ks. You then need to "normalize" against genes that you are confident are evolving neutrally. Instead of normalizing, you can test whether the ratio for a particular gene or set of genes differs from the "norm" that corresponds to neutral evolution, using a chi-squared test or a similar statistic. The data might need some filtering to remove outliers. This is a simplified approach; more precise approaches can be used too.
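
For illustration, here is a minimal Python sketch of this workaround, using the usual definition Ka/Ks = (nonsynonymous SNPs / nonsynonymous sites) / (synonymous SNPs / synonymous sites). The function names, the example counts, and the site-ratio value of 0.3 are made-up placeholders rather than anything from the question; in practice the constant would have to come from your own organism or from a set of reference genes. The test is a plain Pearson chi-squared test on a 2x2 table with no continuity correction, and it assumes every row and column total is non-zero.

```python
import math


def approx_ka_ks(nonsyn_snps, syn_snps, syn_to_nonsyn_sites):
    """Approximate Ka/Ks from SNP counts alone.

    Ka/Ks ~ (nonsyn SNPs / syn SNPs) * (syn sites / nonsyn sites),
    where the second factor is assumed to be the same for all genes.
    """
    if syn_snps == 0:
        # The ratio is undefined without any synonymous SNPs.
        return float("nan")
    return (nonsyn_snps / syn_snps) * syn_to_nonsyn_sites


def chi2_vs_reference(gene_nonsyn, gene_syn, ref_nonsyn, ref_syn):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 table.

    Rows: gene of interest, neutral reference set.
    Columns: nonsynonymous SNPs, synonymous SNPs.
    Returns (chi-squared statistic, p-value).
    """
    table = [[gene_nonsyn, gene_syn], [ref_nonsyn, ref_syn]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper tail of the chi-squared
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
SITE_RATIO = 0.3  # assumed synonymous/nonsynonymous site ratio (placeholder)
print(approx_ka_ks(nonsyn_snps=30, syn_snps=10, syn_to_nonsyn_sites=SITE_RATIO))
print(chi2_vs_reference(gene_nonsyn=20, gene_syn=10, ref_nonsyn=10, ref_syn=20))
```

Because the site ratio is a single constant, it changes the absolute Ka/Ks values but not the ranking of genes, and it cancels out of the chi-squared comparison entirely, which is why the test needs only the SNP counts. For genes with very few SNPs, an exact test would be a safer choice than the chi-squared approximation.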
