Question: Identifying paralogs using a phylogenetic tree
3.1 years ago by
ashish420 wrote:

How do you identify paralogs from a phylogenetic tree? I was wondering if we can do it like this:
1. Perform BLAST and filter out hits with less than 85% identity.
2. Create a phylogenetic tree with bootstrap support.
3. Filter out of the BLAST output those hits that do not group together in the phylogenetic tree.

But it's not clear to me which ones are paralogs in the phylogenetic tree. See this tree (link): it is a tree of a family of proteins from a single species. Are the ones marked here in red paralogs? It would be really good if someone could elaborate on identifying paralogs in a phylogenetic tree. Thanks.


paralogs phylogeny • 2.4k views
modified 3.1 years ago by Leo Martins220 • written 3.1 years ago by ashish420

You might want to take a look at: They did this for fungi and were able to track down paralogs, especially after the whole genome duplication event.

written 3.1 years ago by Asaf7.6k

Apply the definition: two genes are paralogs if their last common ancestor is a duplication event.
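To make the definition concrete, here is a minimal Python sketch. It assumes the internal nodes of the gene tree have already been labelled as duplication or speciation events (for example by a reconciliation against a species tree); the `Node` class, the helper names and the toy gene names are all made up for illustration and do not come from any particular library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    # "duplication" or "speciation" for internal nodes, None for leaves.
    event: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def path_to(root, leaf_name):
    """Return the list of nodes from root down to the named leaf, or None."""
    if root.name == leaf_name and not root.children:
        return [root]
    for child in root.children:
        sub = path_to(child, leaf_name)
        if sub is not None:
            return [root] + sub
    return None


def last_common_ancestor(root, a, b):
    """Deepest node that lies on both root-to-leaf paths."""
    path_a, path_b = path_to(root, a), path_to(root, b)
    if path_a is None or path_b is None:
        raise KeyError("leaf not found in tree")
    lca = None
    for x, y in zip(path_a, path_b):
        if x is not y:
            break
        lca = x
    return lca


def are_paralogs(root, a, b):
    """Two genes are paralogs if their last common ancestor is a duplication."""
    return last_common_ancestor(root, a, b).event == "duplication"


# Toy gene tree (hypothetical): one duplication followed by a speciation
# in each of the two resulting gene copies.
tree = Node("D", "duplication", [
    Node("S1", "speciation", [Node("human_A"), Node("mouse_A")]),
    Node("S2", "speciation", [Node("human_B"), Node("mouse_B")]),
])

print(are_paralogs(tree, "human_A", "human_B"))  # LCA is D
print(are_paralogs(tree, "human_A", "mouse_A"))  # LCA is S1
```

In this toy tree, `human_A` and `mouse_B` also meet at the duplication node, so under the definition they are paralogs even though they come from different species.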

written 3.1 years ago by Jean-Karim Heriche22k
3.1 years ago by
Leo Martins220 (Lausanne, Switzerland) wrote:

I guess you are asking about gene tree reconciliation. Basically, if you know the species tree, then you can map every node of your gene tree (the one you are estimating) onto it and label the nodes as speciations or duplications, inferring losses along the way. You would then be able to tell which sequences are paralogous to which others. Note that the gene tree may have several leaves labelled by (pointing to) the same species, while the species tree is uniquely labelled.
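As a rough illustration of that mapping, here is a small Python sketch of the classic last-common-ancestor (LCA) reconciliation rule: each gene tree node is mapped to the smallest species tree clade containing all the species below it, and a node is called a duplication when it maps to the same clade as one of its children. The nested-tuple tree encoding, the function names and the toy genes are assumptions made for this example; losses are not counted here, and real tools do considerably more.

```python
def clades(tree):
    """Leaf sets of every node in a nested-tuple tree (a node's own clade comes last)."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result, leaves = [], frozenset()
    for child in tree:
        sub = clades(child)
        result.extend(sub)
        leaves |= sub[-1]  # the child's own clade is the last entry of its list
    result.append(leaves)
    return result


def lca_clade(species, species_clades):
    """Smallest clade of the species tree that contains all the given species."""
    return min((c for c in species_clades if species <= c), key=len)


def label_events(gene_tree, species_of, species_clades, events=None):
    """Fill `events` with {set of genes below node: 'duplication' or 'speciation'}."""
    if events is None:
        events = {}
    if isinstance(gene_tree, str):  # leaf: maps to its own species
        mapped = lca_clade(frozenset([species_of[gene_tree]]), species_clades)
        return frozenset([gene_tree]), mapped, events
    genes, child_maps = frozenset(), []
    for child in gene_tree:
        g, m, _ = label_events(child, species_of, species_clades, events)
        genes |= g
        child_maps.append(m)
    mapped = lca_clade(frozenset().union(*child_maps), species_clades)
    # LCA rule: a node mapping to the same clade as one of its children is a duplication.
    events[genes] = "duplication" if mapped in child_maps else "speciation"
    return genes, mapped, events


# Hypothetical toy data: species tree, gene tree, and gene -> species labels.
species_tree = (("human", "mouse"), "chicken")
gene_tree = ((("hA", "mA"), ("hB", "mB")), "cX")
species_of = {"hA": "human", "hB": "human", "mA": "mouse", "mB": "mouse", "cX": "chicken"}

_, _, events = label_events(gene_tree, species_of, clades(species_tree))
for genes, event in events.items():
    print(sorted(genes), event)
```

Here the node joining the A and B subtrees is labelled a duplication, so genes on opposite sides of it (for example `hA` and `hB`) are paralogs. Under the same rule, a gene tree whose leaves all come from a single species has every internal node mapping to that one species, so every internal node is labelled a duplication.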

Assuming that you don't know the species tree, you can try to infer the species tree(s) that minimize the number of duplications and losses, e.g. with the software iGTP. You can also use more sophisticated models for finding the species tree.

(PS: This is not what you are asking, but AFAIU the most common methods for finding orthologs do not rely on the gene family tree, and use pairwise distances instead.)

written 3.1 years ago by Leo Martins220

Thanks for telling me the exact keywords for a Google search. This really helps. I will read about the things you have mentioned. I also wanted to ask: suppose that, after finding orthologs using pairwise distances, we build a tree from our query sequences and the potential orthologs from the BLAST output. If we then remove the potential orthologs that do not group with our query sequences in the tree, will that make the final results better? Or are the sequences we filtered out distant homologs that should not be removed?

written 3.1 years ago by ashish420