Question: Why don't phylogenetic analyses using DNA sequences and protein sequences result in identical trees?
10 months ago by
mafrah1820 wrote:

I ran a phylogenetic analysis on the DNA sequences and another on the protein sequences for the same accession number, "AB032107", using the rooted UPGMA method in both cases, but the trees are not identical. Why don't the protein sequence and the DNA sequence give the same tree?


Hello mafrah18!

It appears that your post has been cross-posted to another site:

If this post is not yours then let us know, but it appears to be very similar.

This is typically not recommended as it runs the risk of annoying people in both communities.

written 10 months ago by genomax34k
10 months ago by
Thomas80 wrote:

Because a DNA sequence and the equivalent polypeptide sequence do not contain the same information.

In the transition from DNA -> polypeptide, information is lost.

A codon will always map to the same single amino acid, but an amino acid does not map back to just one codon. Hence, there is some ambiguity.
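As a rough illustration of that many-to-one mapping, here is a minimal Python sketch that counts how many codons encode each amino acid. It assumes the standard genetic code (NCBI translation table 1); the names `CODON_TABLE` and `degeneracy` are made up for this example.

```python
from collections import Counter

# Standard genetic code (NCBI table 1), written in the usual T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}

# Number of distinct codons per amino acid ('*' marks a stop codon).
degeneracy = Counter(CODON_TABLE.values())

for aa, n in sorted(degeneracy.items(), key=lambda kv: -kv[1]):
    print(aa, n)
```

Leucine, serine and arginine each have six codons, while methionine and tryptophan have only one, so going from protein back to DNA is ambiguous for most residues.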

Phylogenetic trees are constructed in part on the basis of the similarity between biological sequences, and polynucleotide and polypeptide sequences contain differing information. It is therefore entirely possible to construct phylogenetic trees from a DNA sequence and the corresponding polypeptide sequence and get differing results.
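To make this concrete, here is a toy sketch in Python. It is not the OP's data or software: the three sequences A, B and C are invented, the distance is a plain proportion of differing sites (p-distance), and `upgma` is a bare-bones implementation written for this example. B differs from A by two nonsynonymous substitutions, while C differs from A by eight synonymous ones.

```python
from itertools import combinations

# Standard genetic code (NCBI table 1), T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [x + y + z for x in BASES for y in BASES for z in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def p_distance(a, b):
    """Proportion of aligned positions that differ (no gaps assumed)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma(seqs):
    """Return the UPGMA topology as nested tuples, using p-distances."""
    sizes = {name: 1 for name in seqs}  # cluster -> number of leaves
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(seqs, 2)}
    while len(sizes) > 1:
        # Merge the closest pair of clusters.
        a, b = sorted(min(dist, key=dist.get), key=str)
        merged = (a, b)
        for c in sizes:
            if c not in (a, b):
                d_ac = dist.pop(frozenset((a, c)))
                d_bc = dist.pop(frozenset((b, c)))
                # Size-weighted average distance to the new cluster.
                dist[frozenset((merged, c))] = (
                    sizes[a] * d_ac + sizes[b] * d_bc
                ) / (sizes[a] + sizes[b])
        del dist[frozenset((a, b))]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

# Invented toy sequences (hypothetical, for illustration only).
dna = {
    "A": "CTGAGCCGTAAA",
    "B": "CCGAACCGTAAA",  # 2 nucleotide changes, both nonsynonymous
    "C": "TTATCAAGAAAG",  # 8 nucleotide changes, all synonymous
}
protein = {name: translate(seq) for name, seq in dna.items()}

dna_tree = upgma(dna)
protein_tree = upgma(protein)
print("DNA tree:    ", dna_tree)
print("Protein tree:", protein_tree)
```

At the DNA level A and B are the closest pair, but at the protein level A and C are identical, so the same clustering method groups the taxa differently. Real data will rarely be this extreme, but the mechanism is the same.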


This is true in general, but most of the DNA alignment and phylogenetic tree building methods in use, at least the simple ones, are not codon models. The information content, as it were, and the complexity of phylogenetic models go as Codon > Protein > Nucleotide. Since the OP is doing UPGMA, it is a simple nucleotide model, meaning there are actually fewer states than in the amino acid models. Codon models are a lot more complex and aren't used nearly as often as they could be. They used to be computationally demanding and not doable for decent-sized alignments.

written 10 months ago by Dan Gaston6.8k
10 months ago by
Dan Gaston6.8k wrote:

@Thomas was on the right track with his answer, but as I mentioned in my comment on that post, in your case he has it backwards. Nucleotide models (which are different from codon models) are less complex in terms of modelling evolutionary changes than protein models. You're dealing with a 4x4 transition matrix instead of a 20x20 matrix, and similarly with estimates for 4 nucleotide frequencies versus 20 amino acid frequencies. If your sequences are highly similar to one another, a nucleotide alignment and tree are sometimes more appropriate, because synonymous changes will still be informative. However, as the diversity of your sequences increases, protein alignments become more informative because they have much more information content. You could also go to a full codon model, although I don't know whether any are implemented in the software you may be using. Codon alignments can also get a little tricky to do if you aren't familiar with the methods.
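To put rough numbers on the 4x4 versus 20x20 comparison, the sketch below counts the free parameters of a fully general time-reversible substitution model for a given number of states. This is only a back-of-the-envelope illustration: the function name `free_parameters` is made up here, and in practice protein analyses usually rely on fixed empirical matrices and codon models use far fewer parameters than the fully general count.

```python
def free_parameters(n_states):
    """Free parameters of a fully general time-reversible model.

    Exchangeabilities: one per unordered pair of states, minus one
    because the overall rate scale is fixed.
    Frequencies: n_states - 1, because they must sum to 1.
    """
    exchangeabilities = n_states * (n_states - 1) // 2 - 1
    frequencies = n_states - 1
    return exchangeabilities + frequencies

models = [
    ("nucleotide", 4),
    ("amino acid", 20),
    ("codon (61 sense codons, standard code)", 61),
]

for name, n in models:
    print(f"{name}: {n}x{n} matrix, {free_parameters(n)} free parameters")
```

The state space grows from 4 to 20 to 61, which is the Codon > Protein > Nucleotide ordering of model complexity described in this thread.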

10 months ago by
natasha.sernova2.5k wrote:

To build any tree you need a multiple sequence alignment.

In my opinion, this post describes many of the details you need.

Multiple Alignment: Protein Or Nucleotide Sequence?

See also this discussion below - it might be helpful.
