Extracting divergence times from 4DTv
2.5 years ago
Macspider ★ 3.4k

Hello everyone,

I have just finished generating a 4DTv plot (example) for paralog and ortholog genes in a project I'm working on. While reading papers that include such plots (listed in the introduction here), I see that they often estimate divergence times between species from 4DTv sites. However, I haven't yet found one that states clearly how divergence times are computed from 4DTv.

4DTv plots show the distribution of transversion rates at fourfold degenerate sites across a set of pairwise alignments. Each value is the number of transversions divided by the number of fourfold degenerate sites compared, and it is used as a relative time measure to date genome hybridization / duplication events.
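To make the definition concrete, here is a minimal Python sketch of an uncorrected 4DTv value for one pair of codon-aligned coding sequences. It assumes the standard genetic code and in-frame sequences of equal length; the function names are my own, not from any package. Published analyses often also apply a correction for multiple substitutions, which this sketch omits.

```python
# Codon prefixes (first two bases) whose third position is fourfold
# degenerate under the standard genetic code:
# Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_third_positions(seq_a, seq_b):
    """Return (base_a, base_b) pairs at third positions of codons that are
    fourfold degenerate and share the same first two bases in both sequences."""
    pairs = []
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        # Skip gaps and ambiguous bases at the third position.
        if codon_a[2] not in "ACGT" or codon_b[2] not in "ACGT":
            continue
        pairs.append((codon_a[2], codon_b[2]))
    return pairs


def four_dtv(seq_a, seq_b):
    """Uncorrected 4DTv: transversions / fourfold degenerate sites compared.
    Returns None when the pair shares no usable fourfold degenerate site."""
    pairs = fourfold_third_positions(seq_a, seq_b)
    if not pairs:
        return None
    # A transversion swaps a purine for a pyrimidine or vice versa.
    transversions = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES))
    return transversions / len(pairs)


# Toy example (made-up sequences): CTA/CTC is a transversion,
# GGA/GGG a transition, GCT/GCT identical, ATG is not fourfold degenerate.
example_value = four_dtv("CTAGGAGCTATG", "CTCGGGGCTATG")
```

Running this over every paralog or ortholog pair and plotting a histogram of the values would give the kind of plot described above.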

Is there a way I could convert this to an absolute time measure? More specifically, I am interested in converting the ratio of transversions into millions of years. What would I need to do that? What other variables would I need for such a calculation?

I am currently reading literature and books about time estimation models, but since that could take forever, I thought I'd ask here as well :)

variants divergence time 4DTv substitution • 977 views
2.4 years ago
Macspider ★ 3.4k

I'm answering myself, for future readers:

The 4DTv ratio cannot be converted into millions of years on its own: it is a relative measure of time, and without an external calibration it has no absolute scale.

However, one can use the rate of substitution at neutrally evolving sites to determine age. Basically:

  • parse a pairwise alignment in codons
  • select only codons belonging to the fourfold-degenerate group
  • extract third positions of each and count them (tot. positions)
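  • extract the number of positions which differ between the two aligned sequences (substitutions)
  • compute substitutions / tot. positions to get the divergence (substitutions per site)
  • divide it by a known substitution rate per site per year (a per-generation rate divided by the generation time in years); for two lineages descending from a common ancestor, divide by twice that rate, since both accumulate substitutions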

The resulting number should be an approximation of how many years have gone by. Be careful: this assumes a constant mutation rate, so it is only valid if mutation rates have not differed among the species tested (i.e. almost never).
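The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: standard genetic code, codon-aligned in-frame sequences, no correction for multiple substitutions at the same site, and the usual two-lineage relation T = K / (2r), where K is substitutions per site and r is the rate per site per year. The function names and the rate values in the example are placeholders of mine, not measured values.

```python
# Codon prefixes whose third position is fourfold degenerate
# (standard genetic code).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_differences(seq_a, seq_b):
    """Return (substitutions, total_positions) at third positions of
    fourfold degenerate codons shared by two codon-aligned sequences."""
    total = 0
    substitutions = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        if codon_a[2] not in "ACGT" or codon_b[2] not in "ACGT":
            continue  # skip gaps and ambiguous bases
        total += 1
        if codon_a[2] != codon_b[2]:
            substitutions += 1
    return substitutions, total


def rate_per_year(rate_per_generation, generation_time_years):
    """Convert a per-site, per-generation rate into a per-site, per-year rate."""
    return rate_per_generation / generation_time_years


def divergence_time_years(substitutions, total_positions, rate_per_site_per_year):
    """Approximate time since divergence, assuming a constant rate.
    The factor 2 reflects that both lineages accumulate substitutions."""
    if total_positions == 0:
        raise ValueError("no fourfold degenerate positions to compare")
    divergence = substitutions / total_positions  # substitutions per site
    return divergence / (2 * rate_per_site_per_year)


# Placeholder numbers for illustration only: 5 differences over 100 sites
# and an assumed rate of 1e-8 substitutions per site per year.
example_years = divergence_time_years(5, 100, 1e-8)
```

With real data you would pool the counts over many gene pairs before dividing, since a single alignment has too few fourfold degenerate sites to give a stable estimate.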

If you can't assume a constant mutation rate per generation, then you can still get a very rough picture of the divergence time in millions of years, knowing that it is imprecise.


You can also accept your own answer to close this thread

