Question: Phylogenetic Tree And Ensembl
10.0 years ago by
European Union
Jamand110 wrote:

Hi all, I'm building a phylogenetic tree based on protein sequences. I selected the sequences using PSI-BLAST and built an ML tree. Then I visited the Ensembl site, coming from Entrez Gene, and found orthologous and paralogous sequences that weren't listed in the PSI-BLAST results. Could you kindly suggest me the different results in Ensembl? Would you suggest building the protein phylogenetic tree from the Ensembl orthologs and paralogs instead?

best regards

modified 8.7 years ago by Arelicorlaior • written 10.0 years ago by Jamand

Do you mean "explain the different results" by the first "suggest"?

written 10.0 years ago by Michael Schubert

Please post the sequence IDs so we can take a look.

written 10.0 years ago by Rm

So this question is not really about phylogenetics and trees. It's about why you get different results from BLASTing against a database (which database did you BLAST against?) than from the paralogous and orthologous genes annotated in a database.

written 10.0 years ago by Michael Dondrup

I think it's about BLASTing protein sequences, finding their homologs and orthologs, and generating the corresponding phylogenetic tree.

written 10.0 years ago by Thaman
10.0 years ago by
Cambridge, UK
Giulietta - Ensembl Helpdesk wrote:

The Ensembl pipeline starts with protein sequences (the longest protein for every gene in Ensembl), and calculates BLAST reciprocal hits. After that, M-coffee and TreeBest are used to make the tree, and to determine homology relationships. The full pipeline is here:

Without knowing what group of proteins you are starting with, or your PSI-BLAST parameters, I'm guessing that the initial BLAST + Smith-Waterman step is picking up more relationships. You're welcome to try our pipeline.
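
To make the "reciprocal hits" idea concrete, here is a minimal sketch of plain reciprocal best hits in Python. It is only an illustration of the concept, not the Ensembl pipeline code: the function names and the toy IDs are made up, and it assumes you have already parsed BLAST tabular output into (query, subject, bit score) tuples.

```python
def best_hits(rows):
    """Map each query ID to its highest-scoring subject ID.

    rows: iterable of (query_id, subject_id, bit_score) tuples, e.g. taken
    from columns 1, 2 and 12 of default BLAST tabular output (-outfmt 6).
    """
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Toy data (hypothetical IDs and scores, for illustration only).
a_vs_b = [
    ("humA", "musA", 250.0),
    ("humA", "musB", 90.0),
    ("humB", "musB", 180.0),
    ("humC", "musA", 120.0),  # humC's best hit is musA, but not vice versa
]
b_vs_a = [
    ("musA", "humA", 245.0),
    ("musB", "humB", 175.0),
    ("musB", "humA", 88.0),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In the toy data, humC is dropped because musA points back to humA. A one-directional search such as PSI-BLAST has no such reciprocity filter, which is one way the two approaches can end up with different sequence sets.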

9.8 years ago by
Dror280 wrote:

I would go with a more controlled ortholog database such as OrthoMCL or InParanoid: BLAST against their databases and look for orthologs there. This will give you a better overall view of the orthologs of your gene. Ensembl is nice, but I prefer relying on either Entrez RefSeq proteins or UniProtKB, to avoid duplication and to get more controlled BLAST results and ortholog groups. Plus, they have a better interface.

8.7 years ago by
Arelicorlaior50 wrote:

Most probably the differences between your gene tree and the Ensembl genetree are due to differences in the methods used to analyse such data.

The parameters used to include sequences in, or exclude them from, a gene family can make the family small but tightly aligned, or large but sparsely aligned.
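
As a small illustration of that point, the sketch below applies two different E-value cutoffs to the same hit list and shows which sequences only enter the family under the looser one. The hit list, the cutoffs and the function name are all hypothetical. Real pipelines use more criteria than a single E-value.

```python
def family_members(hits, max_evalue):
    """Return the set of sequence IDs whose E-value passes the cutoff.

    hits: iterable of (sequence_id, evalue) tuples.
    """
    return {seq_id for seq_id, evalue in hits if evalue <= max_evalue}


# Hypothetical hits for one query protein.
hits = [
    ("seq1", 1e-80),
    ("seq2", 1e-35),
    ("seq3", 1e-8),
    ("seq4", 0.002),
    ("seq5", 0.5),
]

strict = family_members(hits, max_evalue=1e-30)   # small, closely related set
relaxed = family_members(hits, max_evalue=0.01)   # larger, more divergent set

# Sequences that appear only with the looser cutoff.
only_relaxed = sorted(relaxed - strict)
print(len(strict), len(relaxed), only_relaxed)
```

Running the same kind of set comparison on your PSI-BLAST IDs and the Ensembl homolog IDs (after mapping them to a common identifier type) would show exactly which sequences each method adds.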

Ensembl uses a combination of multiple aligners and TreeBest to generate the tree from the alignments:
