7.7 years ago
Rox
Hi everyone!
I tried to use MAKER, but it ran on only 1 of my 32 CPUs. I want to run MAKER multi-threaded, but I don't see a --threads or -p option like the ones I'm used to.
I did some research, and it seems we should use MPI instead: https://wiki.hpcc.msu.edu/display/Bioinfo/MAKER+Tutorial
I thought MPI was only for distributed clusters (which is not my case; I want to run everything on my own computer) and that it was also a nightmare for beginners to use.
What do you think about it? Do you have any advice that could be useful for me?
Thanks!
Roxane
I don't know if there are any benefits to using MPI on a single multi-core machine. You have already mentioned the headaches of installing an MPI environment, compiling MAKER with MPI support, etc. Ultimately all your cores share the same memory (not sure how much you have).
You may be able to split your input and start multiple independent jobs as a brute-force way of parallelizing things. This may require plenty of RAM.
Hmmm, I see. So how am I supposed to split my input then? And most of all, how should I merge all my result files together? Also, what does "plenty of RAM" mean for this job? I think my super-computer can handle it, but I'm not sure. So, apart from MPI, is there no faster way to run MAKER? The MAKER paper claims it is multi-threaded, but it seems this can only be achieved with MPI, which is not satisfying for me. I think it would take ages to run MAKER without multithreading. And since I need to run it several times in order to train the ab initio tools, I would appreciate it if the MAKER step took less than 4-5 days (maybe more, because sadly the process was aborted due to memory issues...).
Disclaimer: I have not used MAKER, so take the following with a grain of salt. With that out of the way:
Your input for MAKER must be a multi-FASTA file of sequences? You could split the file up into chunks and start multiple instances of MAKER. Based on the example you posted above, 5 GB per job seems to be required (unless that is a toy example and real jobs need more memory).
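Splitting a multi-FASTA into chunks is straightforward with standard tools. Here is a minimal stdlib Python sketch of the idea; the function name, file names, and chunk count are illustrative assumptions, not anything MAKER-specific:

```python
# Minimal sketch: split a multi-FASTA file into N roughly equal chunks,
# one chunk per independent MAKER job. File names are illustrative.

def split_fasta(path, n_chunks, prefix="chunk"):
    # Read all records (header line + sequence lines) into memory.
    records = []
    with open(path) as fh:
        header, seq = None, []
        for line in fh:
            if line.startswith(">"):
                if header is not None:
                    records.append((header, seq))
                header, seq = line, []
            else:
                seq.append(line)
        if header is not None:
            records.append((header, seq))

    # Deal records out round-robin so chunks stay balanced in record count.
    out_paths = []
    for i in range(n_chunks):
        out_path = f"{prefix}_{i}.fasta"
        out_paths.append(out_path)
        with open(out_path, "w") as out:
            for header, seq in records[i::n_chunks]:
                out.write(header)
                out.writelines(seq)
    return out_paths
```

Each chunk can then be fed to its own MAKER instance. For real genomes, an established tool (e.g. `seqkit split` or Biopython's `SeqIO`) does the same job more robustly than this sketch.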
I don't know the size of your input dataset, but this page seems to indicate the following. Using MAKER via the iPlant initiative may be another option.
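On the merging question above: as far as I know, MAKER ships accessory scripts (`gff3_merge`, `fasta_merge`) that collect per-run outputs, and those are the proper tools to use. If you just wanted to concatenate the per-chunk GFF3 files yourself, a naive stdlib sketch (function and file names are assumptions for illustration) could look like:

```python
# Naive sketch: concatenate several GFF3 files, keeping the
# "##gff-version" directive only from the first file.
# MAKER's own gff3_merge accessory script is the proper tool;
# this is only a minimal illustration of the idea.

def merge_gff3(paths, out_path):
    with open(out_path, "w") as out:
        for i, path in enumerate(paths):
            with open(path) as fh:
                for line in fh:
                    # Skip the version directive in all but the first file.
                    if i > 0 and line.startswith("##gff-version"):
                        continue
                    out.write(line)
```

Note this does nothing clever (no coordinate adjustment, no deduplication); since each chunk holds distinct contigs, plain concatenation of their feature lines is usually enough.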