Question: bwa mem segfault
22 months ago, lindsay.liang wrote:

Hi, I'm running bwa mem (v. 0.7.15) on some whole-exome sequencing FASTQs (paired-end, Illumina) and I'm getting a segmentation fault very early in the run. Here's the command:

bwa mem -t 8 -M -R "@RG\tID:D658\tPL:ILLUMINA\tSM:D658" localDir/human_g1k_v37.fasta localDir/D658_S6_L001_R1_001.fastq.gz localDir/D658_S6_L001_R2_001.fastq.gz

Here's the last part of the output:

@SQ SN:GL000200.1   LN:187035
@SQ SN:GL000193.1   LN:189789
@SQ SN:GL000194.1   LN:191469
@SQ SN:GL000225.1   LN:211173
@SQ SN:GL000192.1   LN:547496
@RG ID:D658 PL:ILLUMINA SM:D658
@PG ID:bwa  PN:bwa  VN:0.7.15-r1140 CL:bwa mem -t 8 -M -R @RG\tID:D658\tPL:ILLUMINA\tSM:D658 localDir/human_g1k_v37.fasta localDir/D658_S6_L001_R1_001.fastq.gz localDir/D658_S6_L001_R2_001.fastq.gz
[M::process] read 800000 sequences (80000000 bp)...
Segmentation fault

I'm actually running this on an AWS EC2 m4.2xlarge instance, so there are 8 vCPUs and 32 GB of memory. So I don't think a lack of resources is the problem.

Any feedback would be much appreciated!


Hey Lindsay, segmentation faults are indeed usually related to memory or disk space, as you've implied. I believe that the standard storage available on EC2 is 8 GB; have you used pretty much all of that? Low disk space could also provoke a segmentation fault.

The other thing that I'd check is to ensure that you have indexed the 1000 Genomes reference FASTA with the same version of bwa that you are using for alignment.

Edit: if you also recently upgraded the RAM on your EC2 instance, it may take a few hours before this extra RAM becomes available.

written 22 months ago by Kevin Blighe (modified 22 months ago)

Thanks for your response Kevin! When I first spun up the instance I added 20 GB of storage to it, so again I don't think that disk space is the problem. My reference file was also indexed with 0.7.15, so that's not the issue either.

written 22 months ago by lindsay.liang

Okay, and are you sure that the FASTQ files are correctly formatted? Have you tried even starting the run on another computer? How much RAM appears to be available when you run the top command in Bash (look for 'KiB Mem'; exit top by pressing q)?

written 22 months ago by Kevin Blighe

The FASTQs look fine at first glance: R1 and R2 have the same number of lines (so the read counts match), and just by viewing them with less they look properly formatted. I've spun up different instances and retried the run several times, and there's no difference.

The mem line on top says Mem: 32949384k total, 23330372k used, 9619012k free, 131032k buffers.
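For a check that goes a bit further than eyeballing the files with less, here is a minimal Python sketch of the FASTQ sanity check described above. It assumes plain 4-line FASTQ records in gzipped files (the usual Illumina layout); the function names count_fastq_records and check_pair are made up for this illustration and are not part of bwa or any other tool.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped 4-line FASTQ file, validating the layout."""
    n_lines = 0
    seq_len = 0
    with gzip.open(path, "rt") as handle:
        for n_lines, line in enumerate(handle, start=1):
            pos = n_lines % 4  # 1=header, 2=sequence, 3=separator, 0=quality
            text = line.rstrip("\r\n")
            if pos == 1 and not text.startswith("@"):
                raise ValueError(f"{path}: line {n_lines} should start with '@'")
            if pos == 2:
                seq_len = len(text)
            if pos == 3 and not text.startswith("+"):
                raise ValueError(f"{path}: line {n_lines} should start with '+'")
            if pos == 0 and len(text) != seq_len:
                raise ValueError(
                    f"{path}: line {n_lines} quality length differs from sequence length"
                )
    if n_lines % 4 != 0:
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4 (file may be truncated)"
        )
    return n_lines // 4


def check_pair(r1_path, r2_path):
    """Return the shared record count, or raise if R1 and R2 disagree."""
    n1 = count_fastq_records(r1_path)
    n2 = count_fastq_records(r2_path)
    if n1 != n2:
        raise ValueError(f"record counts differ: R1 has {n1}, R2 has {n2}")
    return n1
```

Calling check_pair("localDir/D658_S6_L001_R1_001.fastq.gz", "localDir/D658_S6_L001_R2_001.fastq.gz") would either return the number of read pairs or raise a ValueError naming the first problem found. It does not verify that read names match between the two files.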

written 22 months ago by lindsay.liang

What's using up 23.3 gigabytes of your RAM?!

written 22 months ago by Kevin Blighe

Ah. Sorry, I'm indexing the reference (again) in the background just to make sure that wasn't the problem.

Ok, I started a new instance after the reindexing was done. Top's memory line now looks like this: KiB Mem : 32949384 total, 19087292 free, 126100 used, 13735992 buff/cache.

Also, I'm no longer getting a segfault error, but instead this:

[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 800000 sequences (80000000 bp)...
[E::bns_fetch_seq] begin=3442128365, mid=3442128366, end=3101804739, len=340323626, seq=0x7f9557b71010, rid=83, far_beg=3101257243, far_end=3101804739
bwa: bntseq.c:444: bns_fetch_seq: Assertion `seq && *end - *beg == len' failed.
Aborted
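
As a quick arithmetic check on that error line, the sketch below plugs the logged numbers into the condition quoted in the assertion message (*end - *beg == len). It assumes the logged begin and end are the same values the assertion compared, which the log itself does not confirm; the variable names are just for illustration.

```python
# Values copied from the [E::bns_fetch_seq] line in the log above.
begin = 3442128365
end = 3101804739
length = 340323626

# The assertion text requires: end - begin == length
span = end - begin
assertion_would_hold = (span == length)

print(span)                   # negative, because 'end' is smaller than 'begin'
print(begin - end == length)  # the logged length matches the span taken in reverse
print(assertion_would_hold)
```

In other words, the coordinates reported for this read are in the opposite order from what the assertion expects, which fits with bwa reading inconsistent data from the index rather than with a lack of memory.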
written 22 months ago by lindsay.liang (modified 22 months ago)

This gets stranger by the minute! The same error was observed here (https://github.com/lh3/bwa/issues/120). Did you index from a compressed FASTA file?

Heng Li, the developer of BWA, even wrote into the program's code that "assertion failure should never happen" (see line 444: https://github.com/lh3/bwa/blob/master/bntseq.c#L444).

written 22 months ago by Kevin Blighe

I just saw that! But alas no, my reference wasn't compressed when I indexed it (the command I used was just bwa index -a bwtsw human_g1k_v37.fasta).

written 22 months ago by lindsay.liang

The final few things that I can suggest are:

  • Use an older version of bwa
  • Try running it on your local machine (even if it's not sufficiently powered, just see if the alignment begins)
  • Ensure that all of your C libraries are installed (although this shouldn't be an issue if you downloaded the binary executable of bwa)
written 22 months ago by Kevin Blighe

So it turns out that one of my index files got corrupted during a transfer from EC2 to AWS S3 (and back again). After I reindexed everything (again) and transferred the files (again), everything seems to be working fine. :|

(Just thought I'd post this for closure).
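
For anyone who hits the same thing, one way to catch this kind of corruption is to checksum the index files before and after a transfer and compare. The following is only a minimal sketch: it assumes the index sits next to the FASTA with the standard bwa index suffixes (.amb, .ann, .bwt, .pac, .sa), and the helper names sha256_of, index_checksums and changed_files are invented for this example.

```python
import hashlib
from pathlib import Path

# Suffixes that `bwa index` appends to the reference FASTA file name.
INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_checksums(fasta_path):
    """Map each index file name found next to the FASTA to its SHA-256."""
    checksums = {}
    for suffix in INDEX_SUFFIXES:
        index_file = Path(str(fasta_path) + suffix)
        if index_file.exists():
            checksums[index_file.name] = sha256_of(index_file)
    return checksums


def changed_files(before, after):
    """List file names that are missing on one side or whose digests differ."""
    names = sorted(set(before) | set(after))
    return [name for name in names if before.get(name) != after.get(name)]
```

The idea would be to run index_checksums("localDir/human_g1k_v37.fasta") once right after indexing, save the result, run it again on the copy that comes back from storage, and pass both dictionaries to changed_files. An empty list means the index files are byte-for-byte identical.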

written 22 months ago by lindsay.liang

You deserve the up vote for that! :)

written 22 months ago by Kevin Blighe