Question: Handling bwa double free or corruption related core dump on an HPC
rizoic wrote (8 weeks ago):

When I run bwa (v0.7.17-r1188) through a scheduler on an HPC, it randomly fails with a core dump, and the log file reports the error glibc: double free or corruption.

A Google search on the error pointed to an environment variable, MALLOC_CHECK_, whose value can be changed to prevent this core dump. Accordingly, I changed the value to 1, which should only put a warning in the log files and not abort.

With this the alignment works fine; however, I wanted more input on whether it is advisable to change this environment variable, and whether it could affect the accuracy of the alignment in any way.
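For reference, the glibc manual describes the MALLOC_CHECK_ values roughly as summarised below (my summary; worth double-checking against mallopt(3) on your system). A minimal sketch of setting it for a single command rather than globally (the file names here are placeholders, not the poster's actual files):

```shell
# MALLOC_CHECK_ behaviour in glibc (summary; verify via mallopt(3)):
#   0  detected heap corruption is silently ignored
#   1  a diagnostic is printed on stderr, execution continues
#   2  abort() is called immediately
#   3  diagnostic is printed and the program aborts
# Setting the variable only for this command leaves other tools
# on the node with the default behaviour:
MALLOC_CHECK_=1 bwa mem -t 6 ref.fasta R1.fastq.gz R2.fastq.gz > out.sam
```

Scoping it to the one command is arguably safer than exporting it in a login script, since value 1 also silences legitimate corruption warnings for everything else that inherits the environment.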

EDIT: Some details on the command and the operating system

The full command I am using is:

bsub -n 6 -q <queue_name> 'bwa mem -t 6 Homo_sapiens_assembly38.fasta L001_R1_001.fastq.gz L001_R2_001.fastq.gz 2> log.txt | samtools sort -o test.bam'

The scheduler is LSF and the operating system is Red Hat Enterprise Linux (kernel 2.6.32). The executable was installed via make install, not through conda.
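One side note on the pipeline form of that command: without pipefail, the job's exit status is that of samtools, so a bwa crash mid-pipe can go unnoticed by LSF. A hedged sketch (bash-specific; file names are the poster's own):

```shell
#!/bin/bash
# Run the same pipeline, but fail the job if ANY stage crashes,
# so an intermittent bwa core dump is reported by the scheduler
# instead of being masked by samtools' exit status.
set -o pipefail
bwa mem -t 6 Homo_sapiens_assembly38.fasta \
    L001_R1_001.fastq.gz L001_R2_001.fastq.gz 2> log.txt \
  | samtools sort -o test.bam
```

This does not fix the corruption, but it makes the failing runs visible in the scheduler's accounting rather than only in log.txt.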

Tags: snp, bwa
modified 8 weeks ago • written 8 weeks ago by rizoic

This is an unusual error, related to glibc rather than bwa mem itself, suggesting something odd with your system. I would try the following:

  • first, post the full command line and give some info about the system you are on. Linux, I guess; which distribution, and is this the native environment, or inside a conda environment or Docker?
  • contact the cluster admin and ask if similar issues have already been reported
  • install Anaconda, make a separate environment, and install bwa via conda in there so it pulls all relevant dependencies; maybe the libraries used by default on your cluster are causing the trouble, and the issue vanishes with a conda build

The third suggestion is rather thinking aloud; I am really not a C/system expert, just something you can try on a Sunday since the admin will not be available today :)
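The conda route above could be sketched like this (the environment name bwa_env is arbitrary; the channels are the usual bioconda setup):

```shell
# Create an isolated environment whose bwa/samtools bring their own
# library stack, bypassing the cluster's default system libraries.
conda create -n bwa_env -c conda-forge -c bioconda bwa samtools
conda activate bwa_env
which bwa   # should now point inside the bwa_env environment
```

If the crash disappears inside the environment, that points at the cluster's system libraries rather than bwa itself.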

written 8 weeks ago by ATpoint

Thanks for your comment. I updated the post with details of the command line and operating system.

The third suggestion actually makes sense. I did compile bwa fresh from the source code, but I think it still uses system libraries, which may have an issue. Will try it now with conda and see if the error is resolved.

written 8 weeks ago by rizoic

Did your sysadmins install the program? How much memory are you allocating to this job? Looking at the command line, it must be using the default (which may be set to a small value, e.g. 4G). Can you explicitly request more memory (say 10+ G) for the job and see if the error goes away?

modified 8 weeks ago • written 8 weeks ago by genomax

Thanks. Yes, the program was installed by the sysadmin.

I have tried the following option to tell the scheduler to allocate more memory explicitly:

bsub -n 6 -q <queue_name> -R "rusage[mem=8024]" 'bwa mem -t 6 t.fasta 1.fastq.gz 2.fastq.gz 2> log.txt|samtools sort -o test.bam'

However, the core dump still occurs. The odd thing is that it happens in roughly one out of 10 jobs, even for a very small (about 10 kb) test file.
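Since it fails roughly one run in ten, a quick way to characterise the failure rate on the small test file is to loop it and log the exit codes (a sketch; the file names follow the test command above):

```shell
# Repeat the small test case and record each exit status; a glibc
# abort typically surfaces as exit code 134 (128 + SIGABRT).
for i in $(seq 1 20); do
  bwa mem -t 6 t.fasta 1.fastq.gz 2.fastq.gz > /dev/null 2>> log.txt
  echo "run $i: exit $?"
done
```

Running this once on the login node and once under bsub could also help separate a scheduler/limits problem from a library problem.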

written 8 weeks ago by rizoic

Can you try -M num_kb instead of -R?

written 8 weeks ago by genomax
Powered by Biostar version 2.3.0