Phylogenetic Tree And Ensembl
13.4 years ago
Jamand ▴ 110

Hi all, I'm building a phylogenetic tree based on protein sequences. I selected the sequences using PSI-BLAST and built an ML tree. Then I visited the Ensembl site from Entrez Gene and found orthologous and paralogous sequences that weren't listed in the PSI-BLAST results. Could you kindly suggest why the results in Ensembl are different? Do you suggest that I build the phylogenetic tree of protein sequences based on the Ensembl orthologs and paralogs instead?

best regards

phylogenetics ensembl orthologues • 4.6k views

Do you mean "explain the different results" by the first "suggest"?


Please post the sequence IDs so we can take a look.


So this question is not really about phylogenetics and trees; it's about why a BLAST search (which database did you search against?) gives you different results from the paralogous and orthologous genes annotated in a database.


I think it's about BLASTing protein sequences, finding their homologs and orthologs, and generating the corresponding phylogenetic tree.

13.4 years ago

Hi,

The Ensembl pipeline starts with protein sequences (the longest protein for every gene in Ensembl) and calculates reciprocal BLAST hits. After that, M-Coffee and TreeBeST are used to build the tree and to determine homology relationships. The full pipeline is described here:

http://www.ensembl.org/info/docs/compara/homology_method.html

Without knowing which group of proteins you are starting with, or your PSI-BLAST parameters, my guess is that the initial BLAST + Smith-Waterman step is picking up more relationships. You're welcome to try our pipeline.
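
To make the "reciprocal hits" idea concrete, here is a minimal Python sketch of reciprocal best hits between two proteomes. It is a deliberate simplification and not the Ensembl pipeline itself, which goes on to build alignments and trees. The sequence IDs, bit scores, and function names (`best_hits`, `reciprocal_best_hits`) are all made up for illustration; in practice the tuples would be parsed from your own BLAST output.

```python
from typing import Dict, Iterable, Set, Tuple

# One BLAST hit: (query_id, subject_id, bit_score). Hypothetical layout.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if backward.get(b) == a}


# Toy data with invented IDs and scores.
a_vs_b = [
    ("humA", "musA", 250.0),
    ("humA", "musB", 180.0),
    ("humB", "musB", 300.0),
    ("humC", "musA", 90.0),   # humC's best hit is musA, but not vice versa
]
b_vs_a = [
    ("musA", "humA", 245.0),
    ("musB", "humB", 310.0),
    ("musB", "humA", 170.0),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(sorted(pairs))
```

In this toy example `humC` is dropped because `musA` prefers `humA`. A one-directional search such as PSI-BLAST from a single query has no such reciprocity check, which is one reason the two sets of sequences can differ.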

13.2 years ago
Dror ▴ 280

I would go with a more ortholog-focused database such as OrthoMCL or InParanoid, and BLAST against their databases to look for orthologs. This will give you a better overall view of the orthologs of your gene. Ensembl is nice, but I prefer relying on either Entrez RefSeq proteins or UniProtKB, to avoid duplication and to get better-controlled BLAST results and ortholog groups. Plus, they have a better interface.

12.1 years ago

Most probably the differences between your gene tree and the Ensembl gene tree are due to differences in the methods used to analyse the data.

The parameters used to include sequences in, or exclude them from, a gene family can make the family small but tightly aligned, or large but sparsely aligned.

Ensembl uses a combination of multiple aligners and TreeBeST to generate the tree from the alignments:

http://www.ensembl.org/info/docs/compara/homology_method.html
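
As a rough illustration of how inclusion parameters change family size, the Python sketch below applies two different E-value cutoffs to the same set of hits. The IDs, E-values, cutoffs, and the `family_members` helper are invented for the example; real pipelines combine several criteria (coverage, identity, clustering), so treat this only as a sketch of the general effect.

```python
from typing import Dict, Set


def family_members(evalues: Dict[str, float], max_evalue: float) -> Set[str]:
    """Keep the sequences whose E-value is at or below the cutoff."""
    return {seq_id for seq_id, e in evalues.items() if e <= max_evalue}


# Hypothetical hits from one search: sequence ID -> E-value.
hits = {"seq1": 1e-80, "seq2": 1e-40, "seq3": 1e-8, "seq4": 0.5}

strict = family_members(hits, 1e-30)   # small, closely related family
lenient = family_members(hits, 1e-3)   # larger, more divergent family

# Sequences that appear only under the lenient cutoff.
only_lenient = lenient - strict
print(sorted(strict), sorted(lenient), sorted(only_lenient))
```

Comparing your own ID list against the Ensembl family in the same way (a set difference) shows exactly which sequences one method kept and the other left out.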
