Question: How To Compute dN/dS Ratio And Get Descriptive Statistics On Thousands Of Sequences?
7.1 years ago
Hi Guys,

I'm looking for the best way to (1) compute dN/dS ratios on thousands of pairs of sequences and (2) get descriptive statistics on these sequences (length, GC3, etc.) at the same time.

What software is available to do that job?


To clarify: you are referring to the ratio of non-synonymous to synonymous substitutions, correct?

7.1 years ago
You could use BioPerl to do this. Calculating dN:dS can be done in a BioPerl script by running PAML. See this page and slide 76 of this presentation. To count the GC content there is a script here, and modifying it to count by codon position (for GC3) should be quite possible.
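For the descriptive statistics part, here is a minimal sketch in plain Python rather than BioPerl, just to show the idea of counting GC by codon position. It assumes every sequence is a coding sequence that starts in frame at codon position 1; the function names `sequence_stats` and `read_fasta` are made up for illustration and do not come from any library. It does not compute dN/dS, which would still go through PAML or a similar tool.

```python
def sequence_stats(cds):
    """Return length, overall GC fraction and GC3 for one coding sequence.

    Assumption: the sequence starts in frame, so every third base
    (index 2, 5, 8, ...) is a third codon position.
    """
    seq = cds.upper()
    # Ignore a trailing partial codon when collecting third positions.
    usable = len(seq) - len(seq) % 3
    third = seq[2:usable:3]
    gc = sum(base in "GC" for base in seq)
    gc3 = sum(base in "GC" for base in third)
    return {
        "length": len(seq),
        "gc": gc / len(seq) if seq else 0.0,
        "gc3": gc3 / len(third) if third else 0.0,
    }


def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            # Use the first word of the header as the identifier.
            name, chunks = (parts[0] if parts else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


if __name__ == "__main__":
    example = [">seq1 toy example", "ATGGCC", ">seq2", "ATGAAATTT"]
    for identifier, sequence in read_fasta(example):
        print(identifier, sequence_stats(sequence))
```

With a real file you would pass an open file handle to `read_fasta` instead of the toy list, and write the resulting values to a table alongside the dN/dS estimates for each pair.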

Or indeed, any of the Bio* projects: they are all well suited to building simple sequence analysis pipelines for many sequences.

7.1 years ago
Have a look at HyPhy (Hypothesis testing using Phylogenies). It has a specialized module for detecting positive and negative selection, which may be relevant to what you want to accomplish on a large dataset of sequences. It might help you out.

