Hey all, I'm new to using MAFFT. I've got it working on my machine using a Docker image, and can get output from it. I have also used the online version here: https://mafft.cbrc.jp/alignment/server/
What it is attempting to do is obviously a hugely computationally expensive process. Aside from choosing different algorithms, or renting time on a powerful HPC VM, are there ways to increase the speed that MAFFT works at?
Some other things I am wondering:
- When running it, I can set --thread 8 --threadtb 5 --threadit 0. However, when I use the command htop to check my core usage, only 2 or 3 cores are ever being used. Is there a way to force more parallelism? While I am using Docker to run it, I am running Docker using --cpuset-cpus="0-6", which should allow the program to access up to 7 cores. CPU use never gets above 40%.
- Is there a GPU enhanced version readily available, to get more performance?
- Is there any "trick-of-the-trade" that I am maybe unaware of right now, that can be used to squeeze a bit more performance from it?
Sticking to Docker images would be my preferred method - however, if there are builds of MAFFT out there that aren't Docker-ised yet but are faster, I'd be interested in using them.
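One thing worth checking with the settings above: --cpuset-cpus="0-6" exposes 7 cores to the container, while --thread 8 asks for 8. A minimal sketch for confirming how many cores a process inside the container can actually use is below. The helper names `usable_cpus` and `suggested_threads` are made up for illustration and are not part of MAFFT or Docker, and it assumes Python is available in the image.

```python
import os


def usable_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    # sched_getaffinity reflects cpuset restrictions, but exists only on Linux;
    # elsewhere fall back to the total CPU count.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def suggested_threads(requested):
    """Clamp a requested thread count to what the process can actually use."""
    return max(1, min(requested, usable_cpus()))


if __name__ == "__main__":
    print("usable CPUs:", usable_cpus())
    # 8 mirrors the --thread value from the question.
    print("suggested --thread value:", suggested_threads(8))
```

If the reported number is lower than the --thread value, the extra threads just compete for the same cores rather than adding parallelism.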
In my experience MAFFT uses threads fairly efficiently, though not necessarily 100% of the time because some of its functions can't be parallelized. I would first suspect the disk speed relative to the speed of your processor, especially if MAFFT has to read large files. Not all the threads will be engaged if reading the files takes too long.
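A rough way to test this is to compare the CPU time a run consumes with its wall-clock time. The sketch below is only an illustration: `effective_cores` is a hypothetical helper, and it has to be run where MAFFT itself runs (for example inside the container), because the `docker` command-line client does not accumulate the container's CPU time as its own child.

```python
import os
import subprocess
import sys
import time


def effective_cores(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds, cpu_seconds / wall_seconds)."""
    before = os.times()
    start = time.perf_counter()
    # Output is discarded here so that only the run itself is timed.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = os.times()
    # CPU time used by finished child processes (user + system).
    # These fields are always zero on Windows.
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


if __name__ == "__main__" and len(sys.argv) > 1:
    wall, cpu, ratio = effective_cores(sys.argv[1:])
    print(f"wall {wall:.1f}s, cpu {cpu:.1f}s, about {ratio:.1f} cores busy on average")
```

Saved as, say, `cpu_ratio.py` (a made-up name), it could be called as `python cpu_ratio.py mafft --thread 7 input.fasta`, where `input.fasta` is a placeholder. A ratio near 1 means mostly one core was busy. A ratio well below the thread count points to serial stages or time spent waiting, for example on the disk.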
Hey! Thanks for the response! So lots of small files might be faster than one big one?