I am using the MIRA assembler on some Ion Torrent reads. Could someone explain to me how the parameter "number_of_threads" works? What does "threads" mean? I could not understand the explanation in the manual.
What value would you recommend for this parameter?
A thread is a unit of execution on your processor. Multithreading can speed up the job. You can set the value as high as the number of threads available on your machine; on Linux, the command nproc tells you how many you have. The speed-up is not linear, though, because at some point you are limited by I/O bottlenecks.
From the manual:
-t integer Number of threads to use. The default value of 0 is configured to automatically use up to 4 CPU cores (if present). Numbers higher than 4 (or maybe 8) will probably not make much sense because of diminishing returns.
So I would start with 4 if your machine supports that.
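If you prefer to check this from a script rather than with nproc, here is a minimal Python sketch. The function name suggest_threads is made up for illustration, and the cap of 4 is only taken from the manual excerpt above; it is not something MIRA enforces.

```python
import os


def suggest_threads(cap=4):
    """Return the number of logical CPUs, limited to `cap` (hypothetical helper)."""
    # os.cpu_count() can return None when the count cannot be determined.
    available = os.cpu_count() or 1
    return max(1, min(available, cap))


if __name__ == "__main__":
    print("Suggested value for -t:", suggest_threads())
```

Note that os.cpu_count() reports logical CPUs, so on a machine with hyper-threading it can be higher than the number of physical cores.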
As a remark, please do not open a new question for every MIRA parameter you have a question about. Keep them in this thread and use ADD COMMENT. We will close further threads unless they are fundamentally different from this one. This is not meant to be impolite but is intended to keep content focused in a single thread rather than being split over multiple ones.
The manual also talks about read lengths, as this is a platform-dependent measure. It says that one should use a value that is appropriate for the platform. For Ion Torrent the read length is often 100-300 bp from what I know. Simply check the read length distribution with e.g. fastqc and then set it accordingly. From what I understand the tool will ignore reads below the set threshold. It probably does not matter too much. If something like 95% of reads are longer than 200 bp, then set it to 200. I am not a user of this tool, so this is more "thinking aloud".
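To make the "95% of reads are longer than X" idea concrete, here is a rough Python sketch. It assumes a plain four-line-per-record FASTQ, and the names read_lengths and length_kept_by are hypothetical helpers, not part of MIRA or fastqc. It only computes a candidate threshold; which MIRA parameter to pass it to is something to check in the manual.

```python
import math


def read_lengths(lines):
    """Collect read lengths from an iterable of FASTQ lines (4 lines per record assumed)."""
    lengths = []
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            lengths.append(len(line.strip()))
    return lengths


def length_kept_by(lengths, fraction=0.95):
    """Largest length L such that at least `fraction` of the reads have length >= L."""
    if not lengths:
        raise ValueError("no reads given")
    ordered = sorted(lengths, reverse=True)
    k = max(1, math.ceil(fraction * len(ordered)))  # number of reads that must pass
    return ordered[k - 1]


if __name__ == "__main__":
    # "reads.fastq" is a placeholder path for your own file.
    with open("reads.fastq") as handle:
        lengths = read_lengths(handle)
    print("Candidate minimum read length:", length_kept_by(lengths))
```

For gzipped input you could open the file with gzip.open(path, "rt") instead of open.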
MIRA had an active mailing list, where its author (Bastien Chevreux) and several highly knowledgeable and skilled users promptly answered questions about MIRA problems and optimization. The list seems dead - you may try your luck subscribing and posting, anyway. Sadly, searching the archives is broken, but you may still browse the list. As the monthly volume is small, you may quickly find some posts on Ion Torrent.
Guys, how are you? Could you help me with this question?
Upon reading the MIRA manual I noticed that it has a parameter called "passes". My interpretation is that "passes" is the number of times that MIRA is executed, improving with each "pass". Am I wrong?
Guys, it's me again.
I ran MIRA and the log returned the following message:
"Out of memory detected, exception message is: std::bad_alloc"
Is this due to my machine's lack of processing power?
What are your machine specs? MIRA is very resource hungry, and excess sequencing coverage can cause an explosion of memory consumption. The manual recommends down-sampling to ~100x coverage.
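To get a feeling for how much down-sampling that means, here is a small Python sketch of the arithmetic. It assumes you know (or can estimate) the genome size and the total number of sequenced bases; the function name subsample_fraction and the example numbers are made up for illustration.

```python
def subsample_fraction(total_bases, genome_size, target_coverage=100):
    """Fraction of reads to keep so that coverage drops to about `target_coverage`."""
    if total_bases <= 0 or genome_size <= 0:
        raise ValueError("total_bases and genome_size must be positive")
    coverage = total_bases / genome_size  # average depth of the full data set
    return min(1.0, target_coverage / coverage)


if __name__ == "__main__":
    # Hypothetical example: 2.5 Gbp of reads for a 5 Mbp genome is about 500x,
    # so roughly one read in five would be kept.
    print(subsample_fraction(2_500_000_000, 5_000_000))
```

The resulting fraction can then be handed to whatever read subsampling tool you already use.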
My machine has a seventh generation Intel Core i7 processor and dedicated 2GB video memory.
The log file did not indicate coverage problems with the reads, but it did identify chimeric reads in the file.