Question: BWA mem alignment taking a while in a virtual machine
4.2 years ago, Robert Sicko wrote:

I've been processing targeted resequencing data from a Haloplex panel run on a MiSeq using Agilent's Surecall software, and have generated variant calls with it. However, I want to generate calls using my own GATK pipeline to compare against Surecall's.

I'm having trouble with the run time of bwa mem when it is run inside VirtualBox.

VirtualBox (note the huge difference between real time and CPU time):

[main] Real time: 21280.065 sec; CPU: 2547.937 sec


[main] Real time: 572.116 sec; CPU: 375.354 sec
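
One rough way to read these timing lines is the ratio of CPU time to real (wall-clock) time. The snippet below is only an illustrative sketch and not part of bwa: `cpu_per_real` is a made-up helper name, and it assumes the log line has exactly the `[main] Real time: ... sec; CPU: ... sec` layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print CPU time divided by real time
# for one bwa "[main] Real time: R sec; CPU: C sec" log line.
cpu_per_real() {
    printf '%s\n' "$1" | awk '{
        real = $4    # 4th whitespace-separated field: real time in seconds
        cpu  = $7    # 7th whitespace-separated field: CPU time in seconds
        printf "%.2f\n", cpu / real
    }'
}

cpu_per_real '[main] Real time: 21280.065 sec; CPU: 2547.937 sec'   # prints 0.12
cpu_per_real '[main] Real time: 572.116 sec; CPU: 375.354 sec'      # prints 0.66
```

CPU time is summed over all threads, so a multi-threaded run that keeps its threads busy should report more CPU time than real time. A ratio around 0.12 suggests the process spent most of its time waiting rather than computing, which would be consistent with swapping or slow disk I/O.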

Surecall uses bwa version 0.7.5a-r405 (Windows port version 1.2) and the following command:

bwa.exe, mem, -M, -D, 0,0, -B, 4, -A, 1.0, -w, 100, -k, 19, -R, @RG\tID:X\tSM:X, -t, 4, hg19.fasta, R1_Cut.fastq, R2_Cut.fastq, >, X.sam

In VirtualBox I'm using bwa version 0.7.12-r103 and the following command:

bwa mem -t 6 -M -R '@RG\tID:X\tSM:X' human_g1k_v37.fasta.gz R1_trimpaired.fastq.gz R2_trimpaired.fastq.gz > X.sam

The VirtualBox guest is Bio-Linux 8 with 8 cores and 5 GB of memory dedicated to it. I suspect 5 GB is not enough and the VM is paging, which would cause the slowdown. I might be able to increase the allocation slightly, but the computer itself only has 8 GB in total and I know the host OS will need some.

The only other difference is the -D parameter that Surecall passes to bwa 0.7.5. I dug through bwa's git repository and found:
"-D FLOAT drop chains shorter than FLOAT fraction of the longest overlapping chain [%.2f]\n", opt->drop_ratio);

I did not see that in the bwa manual, so I did not specify it in my command (0 could be the default anyway as Surecall specified a few parameters with default values).






For the human genome, 5.5 GB of memory is the minimum; 6 GB is better.
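
A quick way to check a Linux guest against that figure is to read MemTotal from /proc/meminfo. This is a minimal sketch, not an official bwa check: `enough_memory` is a hypothetical helper, the 5.5 default is simply the number quoted above, and the kernel usually reports slightly less than the amount allocated to the VM.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MemTotal (reported in kB) is at least
# min_gib. Arguments: [meminfo file] [minimum in GiB]; both are optional.
enough_memory() {
    meminfo_file=${1:-/proc/meminfo}   # Linux-specific default
    min_gib=${2:-5.5}                  # assumed threshold, taken from the answer above
    awk -v min_gib="$min_gib" '
        /^MemTotal:/ {
            total_gib = $2 / 1024 / 1024          # kB -> GiB
            printf "MemTotal: %.1f GiB (minimum %.1f GiB)\n", total_gib, min_gib
            found = 1
            ok = (total_gib >= min_gib)
        }
        END { if (found && ok) exit 0; exit 1 }
    ' "$meminfo_file"
}

if enough_memory; then
    echo "memory looks sufficient"
else
    echo "below the suggested minimum (or MemTotal not found)"
fi
```

Running it inside the guest before starting bwa mem gives an early hint as to whether the alignment is likely to swap.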

Thanks Heng! After running with 6GB allocated to the virtual machine the run times are much better.

[main] Real time: 231.878 sec; CPU: 741.689 sec

Any chance you could elaborate on what the Surecall parameter "-D, 0,0" is doing? Is this the default?


