bwa mem exits with a segmentation fault for large contigs (> 0.4 Mbp)
10.0 years ago

Hello all,

I am using bwa mem to align contigs to a reference genome, and it exits with the error message "Segmentation fault". For example:

[M::main_mem] read 1 sequences (405327 bp)...
Segmentation fault

On closer examination, I found that I see this message whenever the input sequence is >= 0.4 Mbp. The bwa manual page (http://bio-bwa.sourceforge.net/bwa.shtml) mentions that it can handle contigs up to 1 Mbp, so I am unable to figure out where this upper limit of 0.4 Mbp comes from. Is it specific to my computer, or is there a parameter that lets me assign more memory to bwa?
I am currently thinking of breaking contigs larger than 0.4 Mbp into overlapping smaller contigs of size 399999. Is there any other way to go about it?
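
The splitting workaround described above could be sketched roughly as follows. This is only an illustrative sketch, not code from the thread: the function name `split_with_overlap` and the 1,000 bp overlap are assumptions, and only the 399,999 bp chunk size comes from the post. Each chunk is returned with its start offset so that alignment coordinates can be mapped back to the original contig.

```python
def split_with_overlap(seq, chunk_size=399_999, overlap=1_000):
    """Yield (start, chunk) pairs that cover seq with overlapping windows.

    chunk_size is the figure mentioned in the post; overlap is an
    assumed value chosen purely for illustration.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if len(seq) <= chunk_size:
        # Short contigs are passed through unchanged.
        yield 0, seq
        return
    step = chunk_size - overlap
    start = 0
    while True:
        yield start, seq[start:start + chunk_size]
        if start + chunk_size >= len(seq):
            break  # this window reached the end of the contig
        start += step


if __name__ == "__main__":
    # Tiny demonstration with a made-up sequence.
    for offset, chunk in split_with_overlap("ABCDEFGHIJ", chunk_size=4, overlap=1):
        print(offset, chunk)
```

Splitting this way means alignments that span a chunk boundary would have to be stitched together afterwards, which is one reason to look for the cause of the crash before relying on it.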

Thanks all,
Ramya

alignment sequencing next-gen software error • 6.3k views

@Pierre Lindenbaum I am aligning contigs of a hybrid to its parents' genomes. Although the query and the reference will be very similar at the sequence level (>95% similar), I expect to see gross chromosomal rearrangements. So I believe blat, which is suited for aligning at the 'DNA level between two sequences that are of 95% or greater identity, but which may include large inserts', may not work well for me. What are your thoughts?

@Chris Fields Oh, so bwa-mem did run for such large contigs. Now I wonder! I ran bwa-mem with a single contig of 405327 bp and it exited with 'Segmentation fault'. When I ran it with a single contig of 397720 bp, it ran perfectly well, hence my conclusion. Maybe you changed some parameter while compiling the source code and I kept the default? Or did you run it on a system with more RAM than mine (I have 16 GB)? Just guesses.

I ran bwa-sw and it worked for me. So I can go ahead, at least for the time being.

Thanks,
Ramya


If BWA-MEM segfaults, it is a bug. Is your data public? Could you share the data with me? I will only use it for debugging. Thank you.


@lh3-Alt Thanks for your help. I have prepared a folder 'Debug' with the following files: (1) query_sample.fsa (1 sequence), (2) target_sample.fsa, (3) the files generated by indexing target_sample.fsa, and (4) readme.txt with the exact commands I ran. bwa-mem exits with a "Segmentation fault" for this example as well. I have uploaded a tar file of this folder to my Google Drive (https://drive.google.com/file/d/0ByTU79pGWWI0TmJkaWZsem1MRnM/edit?usp=sharing), as I could not find a way to upload it to Biostars itself.

Please let me know if you need anything else.

Thanks,

Ramya


Thanks for the example. I see you were using bwa-0.7.0. Please try a more recent version. The first release of bwa-mem did indeed have bugs, many of which have been fixed over time.


@lh3-Alt Yes, you were right. It ran perfectly with the latest version (bwa-0.7.8). Thank you everyone for the inputs and help :-)

Yours Truly,
Ramya

PS: How do I accept this as the solution and close the thread?


Ah, one of the problems I have seen in Biostar, namely that the answers are in comments. Basically, @lh3-Alt could post that using the latest bwa-mem is the answer below, and you would accept it. Or you could post the answer and indicate @lh3-Alt answered it in the comments, then accept it :)

Also, see this thread regarding some reasoning why you can't accept a comment as an answer.

10.0 years ago

@Chris Fields Thanks.

For future reference, the suggestion by lh3-Alt (check the comments above) worked. I was using an older version of bwa and thus updating it to bwa-0.7.8 worked for me.
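
A quick way to check which bwa version is installed before running a large job might look like the sketch below. It is only a sketch under stated assumptions: the helper names are made up, and it assumes that running `bwa` with no arguments prints a usage message containing a line that starts with `Version:`. The 0.7.8 threshold is simply the version that worked in this thread; the thread does not say which release actually fixed the bug.

```python
import re
import subprocess


def parse_bwa_version(usage_text):
    """Return (major, minor, patch) from a 'Version:' line, or None if absent."""
    match = re.search(r"^Version:\s*(\d+)\.(\d+)\.(\d+)", usage_text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_at_least(version, minimum=(0, 7, 8)):
    """True if version is known and not older than minimum (0.7.8 per this thread)."""
    return version is not None and version >= minimum


def installed_bwa_version():
    """Assumption: `bwa` is on PATH and prints its usage text when run bare."""
    result = subprocess.run(["bwa"], capture_output=True, text=True)
    # Search both streams rather than assuming which one carries the usage text.
    return parse_bwa_version(result.stderr + "\n" + result.stdout)


if __name__ == "__main__":
    # Illustrative text only, not captured output from a real run.
    sample = "Program: bwa\nVersion: 0.7.0\n"
    version = parse_bwa_version(sample)
    print(version, "ok" if is_at_least(version) else "consider upgrading")
```

Comparing versions as integer tuples avoids the pitfalls of plain string comparison, where "0.7.10" would sort before "0.7.8".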

Thanks everyone,

Ramya

10.0 years ago

"to align contigs to a reference genome"

bwa mem is not the right tool here. You'd better use blat


Just to add, I have aligned scaffolds of up to 7.5 Mbp (from an ALLPATHS-LG assembly) to a related mammalian genome using bwa mem. blat is certainly an option (we used it for liftover of annotations), but it wasn't as fast.


BWA-MEM is the right tool here. The segfault should be a bug.
