Question: RAXML phylogenetics analysis
2.5 years ago, MAPK (1.7k) wrote:

I tried to run RAxML (a tool for generating maximum-likelihood phylogenetic trees) using the commands below. I ran 100 bootstraps and got a tree, but the bootstrap value is 100 for every branch. I built an ML tree from the same data with MEGA, which gave a similar topology but completely different bootstrap values. Could someone please tell me whether I am doing anything wrong in the commands below? Thanks in advance for your help.

Here is my aligned fasta file:

I performed model selection using the PROTGAMMAAUTO model option:

raxmlHPC-PTHREADS -s test_mpk.fas -n mpktreeml -m PROTGAMMAAUTO -p 84381764921 -T 20

Then I ran 100 bootstrap trees:

raxmlHPC-PTHREADS -s test_mpk.fas -n mpktreeml_bootstrap_r -N 100 -m PROTGAMMAJTT -p 427482396541 -T 20

Concatenated all tree files:

cat mpktreeml_bootstrap_r* > allBootstraps

Tested majority rule consensus:

raxmlHPC-PTHREADS -z allBootstraps -m PROTGAMMAJTT -I autoMRE -n TEST -p 3824142315 -T 20

Then finally, got the tree:

raxmlHPC-PTHREADS -f b -z allBootstraps -t mpktreeml_bootstrap -m PROTGAMMAJTT -n mpkBOOTSTRAP.txt

I then used iTOL to view mpkBOOTSTRAP.txt, which looks like this:

The tree above looks good, except that it doesn't show the correct bootstrap values compared to the tree generated by MEGA:
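Before drawing support values, it may be worth confirming that the concatenated file really holds the expected number of replicate trees. The following is a minimal sketch, assuming the file contains one Newick tree per line, each terminated by a semicolon; `count_trees` is a hypothetical helper name, not part of RAxML.

```shell
#!/usr/bin/env bash
# Count Newick trees in a file.
# Assumption: one tree per line, each ending in ';' (trailing whitespace allowed).
count_trees() {
    grep -c ';[[:space:]]*$' "$1"
}

# Hypothetical usage with the file name from the question:
#   count_trees allBootstraps
```

If the count is lower than the number of replicates requested, the glob used for concatenation may not have matched the files RAxML actually wrote.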

2.5 years ago, Joe (18k, United Kingdom) wrote:

Are you sure your commands are right?

According to the manual:

Step 4: Bootstrapping

Now let's conduct a simple bootstrap analysis. Initially, let's try to find the best-scoring ML tree for a DNA alignment. We refer to this as the best-scoring tree because the ML search problem is computationally hard and we can thus generally not find the optimal tree under ML for a given alignment.

Let's execute:

raxmlHPC -m GTRGAMMA -p 12345 -# 20 -s dna.phy -n T13

This command will generate 20 ML trees on distinct starting trees and also print the tree with the best likelihood to a file called RAxML_bestTree.T13. Now we will want to get support values for this tree, so let's conduct a bootstrap search:

raxmlHPC -m GTRGAMMA -p 12345 -b 12345 -# 100 -s dna.phy -n T14

We need to tell RAxML that we want to do bootstrapping by providing a bootstrap random number seed via -b 12345 and the number of bootstrap replicates we want to compute via -# 100. Note that RAxML also allows for automatically determining a sufficient number of bootstrap replicates; in this case you would replace -# 100 by one of the bootstrap convergence criteria -# autoFC, -# autoMRE, -# autoMR, -# autoMRE_IGN.

Having computed the bootstrap replicate trees that will be printed to a file called RAxML_bootstrap.T14, we can now use them to draw bipartitions on the best ML tree as follows:

raxmlHPC -m GTRCAT -p 12345 -f b -t RAxML_bestTree.T13 -z RAxML_bootstrap.T14 -n T15

This call will produce two output files that can be visualized with Dendroscope: RAxML_bipartitions.T15 (support values assigned to nodes) and RAxML_bipartitionsBranchLabels.T15 (support values assigned to branches of the tree). Note that for unrooted trees the correct representation is actually the one with support values assigned to branches and not nodes of the tree!
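As a rough illustration, the three manual steps could be adapted to the protein alignment from the question along the following lines. This is only a sketch under stated assumptions: the run names (mpk_ml, mpk_bs, mpk_support) are hypothetical, the seeds are copied from the manual excerpt, and PROTGAMMAJTT is simply the model the question used. Note that the second call passes -b, which per the manual passage is what tells RAxML to bootstrap; the question's second command has no -b. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the three RAxML calls instead of executing them.
# Set DRY_RUN=0 to really run them (requires raxmlHPC-PTHREADS on PATH).
DRY_RUN="${DRY_RUN:-1}"
ALN="test_mpk.fas"      # alignment file name from the question
MODEL="PROTGAMMAJTT"    # model used in the question (an assumption here)
THREADS=20              # thread count from the question

# Echo a command, and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

raxml_workflow() {
    # 1. ML search from 20 distinct starting trees;
    #    following the manual's naming, the best tree goes to RAxML_bestTree.mpk_ml
    run raxmlHPC-PTHREADS -T "$THREADS" -m "$MODEL" -p 12345 -# 20 -s "$ALN" -n mpk_ml
    # 2. Bootstrap search: -b supplies the bootstrap seed;
    #    replicates go to RAxML_bootstrap.mpk_bs
    run raxmlHPC-PTHREADS -T "$THREADS" -m "$MODEL" -p 12345 -b 12345 -# 100 -s "$ALN" -n mpk_bs
    # 3. Draw bootstrap support from the replicates onto the best ML tree
    run raxmlHPC-PTHREADS -T "$THREADS" -m "$MODEL" -p 12345 -f b \
        -t RAxML_bestTree.mpk_ml -z RAxML_bootstrap.mpk_bs -n mpk_support
}

raxml_workflow
```

The key design point is that the tree passed to -t in the last step is the best ML tree from step 1, while the file passed to -z contains only the bootstrap replicates from step 2.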

The -# flag governs the number of bootstrap replicates, but it looks to me like you're using -N?


Isn't -# the same as -N? Here is what the manual says: "The current MPI version only works properly if you specify the -# or -N option in the command line, since it has been designed to do multiple inferences or rapid/standard BS (bootstrap) searches in parallel!"

Reply written 2.5 years ago by MAPK (1.7k)

You might be right. Personally, I've only ever seen the hash used, though I doubt that's really the problem.

Do you really need to use the PTHREADS binary anyway? Your tree isn't very large, and although I haven't looked at the FASTA file, at 18k I'd guess that's not very large either.

Personally, ever since I discovered IQ-TREE, I stopped using RAxML (which to my mind has one of the most confusing CLIs in existence).

Reply written 2.5 years ago by Joe (18k)

Yes, I need to do this on an HPC cluster. The data I am sharing here is just mock data. I have a very large dataset to analyze, so I would like the multithreading option.

Reply written 2.5 years ago by MAPK (1.7k)

I can't see anything on that manual page about -N, though I'm not at a terminal to check it myself. Might be worth just trying it with -# instead to see if that solves it?
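One way to settle the -# versus -N question at a terminal is to search the program's own help text for the flag. A small sketch, assuming raxmlHPC is installed and prints its option summary with -h; `help_lines_for_flag` is a hypothetical helper that just filters whatever help text is piped into it.

```shell
#!/usr/bin/env bash
# Print the lines of a help text (read from stdin) that mention a given flag.
# $1 = flag to look for, matched as a fixed string (e.g. -N)
help_lines_for_flag() {
    grep -F -- "$1"
}

# Hypothetical usage (requires raxmlHPC on PATH):
#   raxmlHPC -h | help_lines_for_flag "-N"
```

If the help output lists -N alongside -#, the two can be treated as interchangeable for the installed version.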

Reply written 2.5 years ago by Joe (18k)