Phylogenetic tree of gene clusters
7 months ago
A_heath ▴ 60

Hi all,

I would like your help and advice on how to estimate the evolutionary pattern of bacteria with respect to specific gene clusters, please.

First, is it appropriate to build a phylogenetic tree from several genes (the genes in a cluster)? Or would it be better to select a single gene from each cluster and compare it across several bacteria?

Is it better to use DNA or protein sequences for phylogeny?

Thank you for your much appreciated help!


evolution clusters phylogeny • 411 views

I'm afraid there isn't a simple or one-size-fits-all answer to your question, as it depends on the fundamental biological question you're asking.

Since these are bacteria, it will also depend on what you know, if anything, about these genes. If they're prone to horizontal transfer, for example, or they reside on plasmids, the phylogenetic signal will be borderline unintelligible, because the history of the genes no longer tracks the history of the organisms.

If you think these are core and largely immobile proteins, whether to use DNA or protein largely comes down to the question you're interested in, and how distantly related the strains/proteins are likely to be. If they are quite distantly related, it can be difficult to obtain good DNA alignments, and protein sequences are often easier to align.


Thank you for your answer, Joe.

Just to confirm: when constructing phylogenetic trees from proteins, is it only valid if a single protein (common to the other bacteria) is selected? In other words, I cannot build a tree using several proteins within the cluster; I have to select only one, right?

Thanks again for your help,



I'm not sure I fully understand, but no, you can (and often should) use multiple proteins. You can perform separate locus-wise alignments and then concatenate these alignments to create phylogenetic trees - this is similar to how MLST is done for example.
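To make the concatenation step concrete, here is a minimal sketch in plain Python. It assumes each locus has already been aligned separately (with whatever aligner you prefer) and loaded as a mapping from strain name to aligned sequence. The function name `concatenate_alignments`, the strain names and the toy sequences are all made up for illustration. Strains missing from a locus are padded with gaps, which is one common convention but not the only one.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: strain name -> aligned sequence (gaps as '-')
Alignment = Dict[str, str]


def concatenate_alignments(
    loci: Dict[str, Alignment],
) -> Tuple[Alignment, List[Tuple[str, int, int]]]:
    """Join per-locus alignments into one supermatrix.

    Returns the concatenated alignment and a list of
    (locus, start, end) partitions using 1-based, inclusive columns.
    """
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for locus, aln in loci.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"locus {locus!r} is not a valid alignment")
        length = lengths.pop()
        for taxon in taxa:
            # Pad strains that lack this locus with an all-gap block
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((locus, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy, invented protein alignments for two loci
loci = {
    "geneA": {"strain1": "MKT-L", "strain2": "MKTAL", "strain3": "MRTAL"},
    "geneB": {"strain1": "GGS", "strain2": "GAS"},
}
matrix, partitions = concatenate_alignments(loci)
for taxon, seq in matrix.items():
    print(f">{taxon}\n{seq}")
print(partitions)
```

The partition list is kept because many tree-building programs can fit a separate substitution model per locus if you tell them where each locus starts and ends in the supermatrix.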

If you have a gene cluster, though, you don't necessarily need to use the proteins; you could use the DNA sequence of the entire cluster, including intergenic regions. Again, it depends on your use case/question.

Another approach you can use is to create separate trees for each of multiple genes, and then use phylogenetic congruency and consensus methods to derive an 'inferred' tree. This is what tools like ASTRAL-II are for, for example.
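As a rough illustration of the "separate trees, then look for agreement" idea, the sketch below counts how often each clade appears across a set of rooted gene trees and keeps those found in more than half of them, which is the core of a majority-rule consensus. This is not how ASTRAL works (it uses a quartet-based approach on unrooted trees); it is only a simplified stand-in. The nested-tuple tree format, the function names and the toy trees are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# Hypothetical tree format: a leaf is a strain name, an internal node is a tuple
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Leaf sets of all internal nodes, excluding the root."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    found.discard(leaves(tree))  # the root clade carries no information
    return found


def majority_clades(
    trees: List[Tree], threshold: float = 0.5
) -> Dict[FrozenSet[str], float]:
    """Clades present in more than `threshold` of the trees, with frequency."""
    counts = Counter(clade for tree in trees for clade in clades(tree))
    return {
        clade: n / len(trees)
        for clade, n in counts.items()
        if n / len(trees) > threshold
    }


# Toy, invented gene trees for three loci
gene_trees: List[Tree] = [
    ((("s1", "s2"), "s3"), "s4"),
    ((("s1", "s2"), "s4"), "s3"),
    ((("s1", "s3"), "s2"), "s4"),
]
support = majority_clades(gene_trees)
for clade, freq in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(clade), round(freq, 2))
```

Clades with low support across loci are a hint that the genes have conflicting histories, which is exactly what you would expect if some of them have been horizontally transferred.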

