Phylogenetic analysis for large FASTA sequence datasets
4.2 years ago
MAPK ★ 2.0k

I am trying to build a maximum likelihood tree. I have a large dataset (FASTA sequences) that is impossible to run in MEGA. Is there a better option that would let me use as many computing resources as I need? The analysis needs at least 500 GB of RAM, so I was thinking of running it on an HPC cluster. Any suggestions for a tool with a multi-threading option would be really appreciated.

phylogenetics • 2.7k views

RAxML? Not sure about multithreading, though.


MUSCLE, MAFFT and T-Coffee should all be good alternatives, assuming you have access to the necessary hardware.

Edit: That is for step one, the multiple sequence alignment (MSA).


Aren't those designed for sequence alignment?


Creating alignments for very large sequence datasets can be computationally very challenging, and that is where MEGA could be struggling. Take a look at this publication for a new perspective on the topic.


Alignment was rather easy. MEGA was struggling with the model testing and bootstrapping steps of the maximum likelihood analysis.


You can use https://github.com/stamatak/standard-RAxML. It works fine for me.


Any new tools for this issue?


I recommend IQ-TREE 2, which is very fast. With the AUTO option, IQ-TREE automatically determines how many threads to use.
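
For anyone who wants to script this on a cluster, here is a minimal sketch of what such a run could look like. It assumes the binary is installed as `iqtree2` and uses IQ-TREE 2 option names (`-m MFP` for model selection, `-B` for ultrafast bootstrap, `-T AUTO` for thread detection). The file name `aln.fasta`, the prefix `run1` and the helper function names are made up for illustration.

```python
import shutil
import subprocess


def build_iqtree_command(alignment, prefix, bootstrap_replicates=1000):
    """Return an IQ-TREE 2 command line as a list of arguments."""
    return [
        "iqtree2",                         # assumed binary name; may differ per install
        "-s", alignment,                   # input alignment
        "-m", "MFP",                       # ModelFinder: pick a model, then infer the tree
        "-B", str(bootstrap_replicates),   # ultrafast bootstrap replicates
        "-T", "AUTO",                      # let IQ-TREE decide how many threads to use
        "--prefix", prefix,                # prefix for all output files
    ]


def run_iqtree(alignment, prefix):
    """Run IQ-TREE 2 if it is on PATH; raise otherwise."""
    if shutil.which("iqtree2") is None:
        raise FileNotFoundError("iqtree2 not found on PATH")
    subprocess.run(build_iqtree_command(alignment, prefix), check=True)


if __name__ == "__main__":
    # Hypothetical input and output names.
    print(" ".join(build_iqtree_command("aln.fasta", "run1")))
```

Older IQ-TREE 1.x releases use `-nt AUTO` and `-bb` instead, so check `--help` for the version on your cluster.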

2.0 years ago
Shalu Jhanwar ▴ 500

For phylogenetic analysis, MrBayes (http://nbisweden.github.io/MrBayes/) and FastTree (http://www.microbesonline.org/fasttree/#OpenMP) both support multi-threading/parallelization.
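
As a rough sketch of the FastTree side: the OpenMP build reads its thread count from the `OMP_NUM_THREADS` environment variable and writes the tree to standard output. The example below assumes the OpenMP binary is installed as `FastTreeMP` and that the input is a nucleotide alignment (`-nt`, with the GTR model via `-gtr`). The file names and helper function names are hypothetical.

```python
import os
import subprocess


def build_fasttree_job(alignment, threads):
    """Return (command, environment) for a multi-threaded FastTree run."""
    # Assumed binary name for the OpenMP build; adjust to your installation.
    cmd = ["FastTreeMP", "-gtr", "-nt", alignment]
    # Copy the current environment and set the OpenMP thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    return cmd, env


def run_fasttree(alignment, tree_path, threads=8):
    """Run FastTree and save the Newick tree it prints to stdout."""
    cmd, env = build_fasttree_job(alignment, threads)
    with open(tree_path, "w") as out:
        subprocess.run(cmd, env=env, stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical input name; only prints the command, does not run it.
    cmd, env = build_fasttree_job("aln.fasta", threads=16)
    print("OMP_NUM_THREADS=" + env["OMP_NUM_THREADS"], " ".join(cmd))
```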

4.2 years ago
h.mon 34k

ExaML, RAxML-NG or RAxML (from fastest to slowest). All three are from the same group and all three support MPI parallelization, which makes them suitable for HPC clusters.
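
To give an idea of what a RAxML-NG run could look like, here is a minimal sketch of an all-in-one analysis (ML tree search plus bootstrapping) on a single node with multiple threads. It assumes the binary is called `raxml-ng` and that the data are DNA, hence `GTR+G`. The file name, prefix, thread count and helper function name are illustrative only. An MPI build would be launched through your cluster's MPI launcher instead, which is not shown here.

```python
import subprocess


def build_raxmlng_command(msa, prefix, threads, model="GTR+G",
                          bootstrap_trees=100, seed=1):
    """Return a RAxML-NG command line as a list of arguments."""
    return [
        "raxml-ng",                          # assumed binary name
        "--all",                             # ML search + bootstrapping + support mapping
        "--msa", msa,                        # input alignment
        "--model", model,                    # substitution model (DNA assumed here)
        "--bs-trees", str(bootstrap_trees),  # number of bootstrap replicates
        "--threads", str(threads),           # threads on this node
        "--seed", str(seed),                 # fixed seed for reproducibility
        "--prefix", prefix,                  # prefix for output files
    ]


if __name__ == "__main__":
    # Hypothetical names; replace subprocess.run's argument with your own paths.
    cmd = build_raxmlng_command("aln.fasta", "run1", threads=32)
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually launch the job
```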