Question

Phylogenetic Analysis

13.4 years ago

User 0063 ▴ 240

Hi all,

I'm new to bioinformatics. I need to run a phylogenetic analysis of a protein sequence, and I'd like to use a maximum likelihood method. Could you give me any advice on building a good MSA and tree? I mean software, tips and so on.

Thank you very much

phylogenetics multiple protein • 9.7k views

Updated 13.1 years ago by Bishesh • written 13.4 years ago by User 0063 ▴ 240

Citing wikipedia: "Phylum" is adopted from the Greek φυλαί phylai, the clan-based voting groups in Greek city-states.

Reply 13.4 years ago by Michael 54k

score 10 · Answer 1 · 2010-11-16

13.4 years ago

Dave Lunt ★ 2.0k

If you really are new to bioinformatics, I would suggest it will be easiest to use one of the excellent online phylogeny pipelines rather than choosing and installing programs locally.

www.phylogeny.fr is excellent and easy to use. It can do alignment using MUSCLE and fast maximum likelihood using PhyML on up to 200 protein sequences. In fact, these are the defaults. Both programs are among the best available, so this will give you a fast, high-quality tree.

There are other online options too (e.g. CIPRES), but Phylogeny-France is easy to use and high quality. It will even display the tree nicely at the end!


CIPRES is great if you have a lot of big sequences (RAxML works well for that, and they run the HPC version).

Reply 13.4 years ago by Gww ★ 2.7k

score 7 · Answer 2 · 2010-11-16

13.4 years ago

Steve Moss 2.3k

Using online systems isn't always the fastest way to achieve results, although it is generally much easier, as you are using a web interface to control the program inputs. If you are able to use your command terminal, then you can download binaries that will give you your results much faster (depending on the machine specs of course).

I'd use MAFFT as my first choice for building the alignments, with MUSCLE a close second (and MUSCLE is the one FastTree recommends). However, I'd recommend FastTree over PhyML for building the trees.

FastTree approximates maximum likelihood: it builds a starting tree by heuristic neighbour-joining, refines it under the minimum-evolution criterion, and then maximizes the tree's likelihood, as detailed here. It also takes input in FASTA format, which means you don't need to convert to PHYLIP format, as you do with PhyML or PHYLIP.

Cheers,

Steve
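For anyone who wants to script the command-line route described above, here is a minimal sketch in Python. It assumes `mafft` and `FastTree` are installed and on your PATH (on some systems the FastTree executable has a different name, e.g. `fasttree`), and the file names are hypothetical placeholders. It relies only on `mafft --auto`, which writes the alignment to standard output, and on FastTree's default behaviour of reading a protein alignment file and writing a Newick tree to standard output.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(fasta_in, aln_out, tree_out):
    """Return (command, output_path) pairs for the align-then-tree steps."""
    return [
        # MAFFT prints the alignment to stdout; --auto lets it choose a strategy.
        (["mafft", "--auto", str(fasta_in)], Path(aln_out)),
        # FastTree treats input as protein by default and prints Newick to stdout.
        (["FastTree", str(aln_out)], Path(tree_out)),
    ]


def run_pipeline(fasta_in, aln_out, tree_out):
    """Run each step in order, redirecting stdout to that step's output file."""
    for command, out_path in build_commands(fasta_in, aln_out, tree_out):
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} not found on PATH")
        with open(out_path, "w") as handle:
            subprocess.run(command, stdout=handle, check=True)


# Hypothetical usage (file names are placeholders):
# run_pipeline("proteins.fasta", "proteins.aln.fasta", "proteins.nwk")
```

The resulting Newick file can then be opened in any tree viewer.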


Thank you very much for your valuable suggestions. I'd also like to ask another question. I've run a PSI-BLAST search against the nr protein database. Which sequences should I use to build the MSA?

Best regards

Reply 13.4 years ago by User 0063 ▴ 240

You would want to use the filtered PSI-BLAST output alignment files. Usually in the format queryname-originalfastafilename_psiali.fasta?

Reply 13.4 years ago by Steve Moss 2.3k

I think you would need to use the filtered PSI-BLAST output alignment files? Usually in the format queryname-originalfastaname_psiali.fasta. I'm not sure if you would need to remove the gaps first. Perhaps someone can clarify that?

Reply 13.4 years ago by Steve Moss 2.3k
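On the gap question: if you do decide to re-align the hits with MAFFT or MUSCLE, one option is to strip the gap characters first so the aligner starts from unaligned sequences. Below is a minimal Python sketch of that step. It assumes the hits are in FASTA format and that gaps are written as "-" or "."; the function names are hypothetical, and whether removing gaps is the right choice for your data is something you would still need to judge.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_gaps(records, gap_chars="-."):
    """Remove gap characters so the sequences can be re-aligned from scratch."""
    table = str.maketrans("", "", gap_chars)
    return [(header, seq.translate(table)) for header, seq in records]
```

The ungapped records can then be written back to a FASTA file and passed to the aligner of your choice.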

score 6 · Answer 3 · 2010-11-16

13.4 years ago

Paulo Nuin ★ 3.7k

Your best option for alignment is MAFFT, and I would recommend PHYLIP to calculate your tree, even though there might be some faster options out there. In this case you would need a file converter to convert from FASTA to the PHYLIP format.
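If you don't have a converter to hand, the conversion is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: it writes the strict sequential PHYLIP layout (a header line with the number of sequences and the alignment length, then each name padded or truncated to 10 characters followed by its sequence), and the function name is hypothetical. Dedicated converters handle more variants and edge cases.

```python
def fasta_to_phylip(fasta_text):
    """Convert an aligned FASTA string to strict sequential PHYLIP text."""
    names, seqs = [], []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence name.
            fields = line[1:].split()
            names.append(fields[0] if fields else "unnamed")
            seqs.append("")
        elif names:
            seqs[-1] += line

    lengths = {len(seq) for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must all be aligned to the same length")

    # Strict PHYLIP names occupy exactly 10 characters.
    short_names = [name[:10] for name in names]
    if len(set(short_names)) != len(short_names):
        raise ValueError("names are not unique after truncation to 10 characters")

    lines = [f" {len(seqs)} {lengths.pop()}"]
    lines += [f"{name:<10}{seq}" for name, seq in zip(short_names, seqs)]
    return "\n".join(lines) + "\n"
```

Note that truncating names to 10 characters can make them collide, which is why the sketch checks for uniqueness before writing anything.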


score 4 · Answer 4 · 2010-12-08

First you need to understand what phylogenetics is. According to Wikipedia, a phylogenetic tree simply shows evolutionary relationships. "Relationship" here means the sharing of common features, which can be further defined in terms of orthologs and paralogs, so a phylogenetic tree can be built from orthologs or from paralogs.

We can build a phylogenetic tree in three steps:

1. Multiple sequence alignment (MSA) - There are plenty of alignment tools available online; reliable ones include ClustalW, T-Coffee and MUSCLE.
2. Phylogeny approach - An MSA alone does not give you a tree, so you need to apply a phylogenetic method to the alignment, which can be done with PHYLIP. PHYLIP offers several methods, including parsimony, distance matrix, maximum likelihood and bootstrapping. For protein sequences like yours you can use PROTDIST in particular. Similarly, you can use SEQBOOT for bootstrapping, PROML for maximum likelihood and CONSENSE for the consensus tree.
3. Visualization - Finally, once the phylogeny step has run, you can draw the phylogenetic tree. Good visualization tools include TreeView and TreeDyn.

Further, you can go through my question archive, which is mostly about approaches to generating phylogenetic trees.
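To make the bootstrapping step in the list above a little more concrete, here is a small Python sketch of the idea behind it: each replicate is made by resampling the alignment columns with replacement, and a tree is then built from every replicate. This illustrates the general technique only and is not a reimplementation of SEQBOOT; the function names and the dictionary-based alignment representation are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate n_replicates resampled alignments from a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

The same column indices are applied to every sequence in a replicate, so each resampled column is still a genuine column of the original alignment.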

score 3 · Answer 5 · 2010-11-16

Use PHYLIP, MEGA or TREE-PUZZLE.

Simple pipeline for a set of protein sequences:

multi.fasta --> Alignment (ClustalW or MUSCLE or T-Coffee) --> Distance matrix (PROTDIST) --> Bootstrap (SEQBOOT) --> Maximum likelihood (PROML) --> Consensus (CONSENSE) --> Visualise the tree (TreeView)

Or, after the alignment step, use TREE-PUZZLE.

TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing.
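As a rough illustration of what the distance-matrix step in the pipeline above produces, the sketch below computes simple p-distances (the fraction of differing residues over columns where neither sequence has a gap) for every pair of aligned sequences. To be clear, this is a swapped-in, much simpler measure than the substitution-model distances PROTDIST calculates; it is only meant to show the shape of the output. All names here are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap_chars="-."):
    """Fraction of differing residues over columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same aligned length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns shared by the two sequences")
    return differing / compared


def distance_matrix(alignment):
    """Build a symmetric nested-dict matrix of pairwise p-distances."""
    names = list(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        dist = p_distance(alignment[a], alignment[b])
        matrix[a][b] = matrix[b][a] = dist
    return matrix
```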

score 3 · Answer 6 · 2010-12-07

Put all your sequences in a single file. Each sequence should start with a header line beginning with the ">" symbol, which indicates FASTA format. Upload this file at http://www.ebi.ac.uk/clustalw, then download the .aln file. Next, download a free program called Jalview and load your .aln file in it, where you can view the alignment and calculate a tree from it. You can download Jalview from the following link: http://www.jalview.org/download.html

Bowang
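If your sequences are currently scattered across a script or spreadsheet, a few lines of Python can assemble the single FASTA file described above. This is a minimal sketch; the function name, the record names and the 60-character line width are assumptions, not requirements of any particular tool.

```python
def to_multi_fasta(records, width=60):
    """Format (name, sequence) pairs as one multi-FASTA string."""
    blocks = []
    for name, seq in records:
        # Drop any spaces or line breaks pasted in with the sequence.
        seq = "".join(seq.split())
        if not seq:
            raise ValueError(f"sequence {name!r} is empty")
        # Wrap the sequence to fixed-width lines under its ">" header.
        body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
        blocks.append(f">{name}\n{body}")
    return "\n".join(blocks) + "\n"


# Hypothetical usage (names and file path are placeholders):
# with open("multi.fasta", "w") as handle:
#     handle.write(to_multi_fasta([("protein1", "MKLV"), ("protein2", "MKIV")]))
```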

score 3 · Answer 7 · 2010-12-07

ClustalW2, T-Coffee and MUSCLE are all good MSA tools, although they differ considerably in implementation.

To estimate the tree, use PHYLIP, which can apply both parsimony and maximum likelihood (ML) methods.

You can also try to use the Bayesian estimation of phylogeny which is very different from the ML method. A good tool for that would be MrBayes. It's easy to install and run - it's probably worth a try. Here is a list of other ML and Bayesian phylogeny estimators: http://mrbayes.csit.fsu.edu/links.php