Samtools Index Segmentation Fault
8.7 years ago
Noushin N ▴ 600

Dear everyone,

I have received a couple of BAM files processed by the CASAVA pipeline (CASAVA-1.9.0) from a collaborator. I would like to create index files for them using samtools index. All but one behave as expected, but one BAM file causes a segmentation fault. The error can be reproduced in samtools releases 0.1.18 and 0.1.19.

Please do not suggest running samtools sort first; I have tried it and it results in the same segfault.

I was wondering if anyone can help me out.

Thank you!

P.S. Following dpryan79's suggestion, here is the output of bt when I tried running index inside gdb:

#0  0x000000371a88ede3 in __memcpy_sse2 () from /lib64/libc.so.6
#1  0x0000000000426686 in bgzf_read (fp=fp@entry=0x6620a0, data=<optimized out>, length=1597059097) at bgzf.c:358
#2  0x000000000042d170 in bam_read1 (fp=fp@entry=0x6620a0, b=b@entry=0x682ba0) at bam.c:218
#3  0x000000000043192f in bam_index_core (fp=fp@entry=0x6620a0) at bam_index.c:182
#4  0x000000000043392e in bam_index_build2 (fn=0x7fffffffe3c1 "../../data/BAMS/SS6004353.bam", _fnidx=_fnidx@entry=0x0) at bam_index.c:484
#5  0x0000000000433a89 in bam_index_build (fn=<optimized out>) at bam_index.c:510
#6  bam_index (argc=<optimized out>, argv=<optimized out>) at bam_index.c:520
#7  0x000000371a821b75 in __libc_start_main () from /lib64/libc.so.6
#8  0x000000000040337d in _start ()


Just a question: can you run samtools flagstat in.bam or samtools view in.bam | wc -l on that file?


samtools flagstat --> no

samtools view --> yes, and wc -l reports ~19k lines for a WGS BAM file of about 169G.

Does this mean there is a buggy read/line that is causing the problem?


Yeah, it sounds like the file is just corrupt.
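
One way to double-check that diagnosis without samtools: a BAM file is a series of gzip-compatible BGZF blocks, so a plain gzip reader should be able to stream through the whole file. Below is a minimal Python sketch of that idea. The helper name first_gzip_error is made up for illustration and is not part of samtools, and on a 169G file this will be slow.

```python
import gzip
import zlib


def first_gzip_error(path, chunk_size=1 << 20):
    """Stream-decompress `path` and report the first error, if any.

    Hypothetical helper (not a samtools function). Returns a tuple
    (bytes_ok, error): `error` is None when the whole file decompresses.
    `bytes_ok` is a lower bound on the bytes read before the failure,
    because buffered data from the failing read is not counted.
    """
    bytes_ok = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    return bytes_ok, None
                bytes_ok += len(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # gzip.BadGzipFile is a subclass of OSError; a truncated stream
        # raises EOFError; damaged deflate data raises zlib.error.
        return bytes_ok, exc
```

If this returns an error, the problem is in the compressed container itself. If it returns no error, the blocks decompress cleanly and the damage would have to be inside the BAM records.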


Thanks for all the great tips and the walk-thru! Does the output of bt indicate the same cause?


Yeah, in this case it does. It looks like the information describing one of the compressed blocks is damaged, which leads to an out-of-range memory access.
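
To find where such damage sits in the file, one could walk the BGZF blocks directly and stop at the first one whose header or payload does not check out. The sketch below follows the BGZF block layout from the SAM/BAM specification (gzip header with a "BC" extra subfield holding BSIZE, then raw deflate data, CRC32 and ISIZE). The function names are hypothetical and this is not how samtools itself does it. Note the limitation: if every block is intact and only a record length inside the decompressed data is garbage, this scan reports nothing.

```python
import struct
import zlib


def _bsize_from_extra(extra):
    """Return BSIZE from the 'BC' extra subfield, or None if absent."""
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (66, 67, 2) and pos + 6 <= len(extra):
            return struct.unpack_from("<H", extra, pos + 4)[0]
        pos += 4 + slen
    return None


def first_bad_bgzf_block(path):
    """Return (file_offset, reason) for the first bad block, else None.

    Hypothetical helper for illustration only.
    """
    offset = 0
    with open(path, "rb") as handle:
        while True:
            header = handle.read(12)
            if not header:
                return None  # reached end of file without problems
            if len(header) < 12:
                return offset, "truncated block header"
            id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack(
                "<BBBBIBBH", header
            )
            if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
                return offset, "not a BGZF block header"
            bsize = _bsize_from_extra(handle.read(xlen))
            if bsize is None:
                return offset, "missing BC subfield"
            # BSIZE is the total block size minus 1.
            remaining = bsize + 1 - 12 - xlen
            if remaining < 8:
                return offset, "implausible BSIZE"
            body = handle.read(remaining)
            if len(body) < remaining:
                return offset, "block runs past end of file"
            crc, isize = struct.unpack("<II", body[-8:])
            try:
                data = zlib.decompress(body[:-8], wbits=-15)  # raw deflate
            except zlib.error as exc:
                return offset, f"deflate error: {exc}"
            if len(data) != isize or zlib.crc32(data) != crc:
                return offset, "CRC32 or ISIZE mismatch"
            offset += bsize + 1
```

The returned offset is a byte position in the compressed file, which can help when deciding whether the data before the damaged block is worth salvaging or whether to ask the collaborator for a fresh copy.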


Thanks a lot dpryan79!


Have you tried compiling samtools with debug symbols and then running it inside gdb? That would show what's going wrong, since no one here will be able to give you more than a guess without a reproducible example.


Thank you for the prompt response. Could you please elaborate on how I can do that? A pointer would be much appreciated!


It turns out that samtools has debug symbols by default, so that makes life easier :)

The general process for using gdb looks like this:

gdb samtools
#A bunch of stuff is printed by gdb
(gdb) run index some_file.bam


You'll then get an error at some point and can use commands like bt (print a backtrace) to find out exactly where the problem is happening. You could just update your post with the output of bt, since that'll give us all enough to get started.
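
If several files need checking, the same gdb session can be run non-interactively. Here is a rough Python sketch that wraps gdb's batch mode (-batch, -ex and --args are standard gdb options). The function names are invented for this example, and it assumes gdb and samtools are both on the PATH.

```python
import subprocess


def gdb_backtrace_command(bam_path, samtools="samtools"):
    """Build a gdb command line that runs `samtools index` and prints bt.

    Hypothetical helper; `samtools` may be a full path to the binary.
    """
    return [
        "gdb", "-batch",      # exit after the commands below have run
        "-ex", "run",         # start the program
        "-ex", "bt",          # print a backtrace once it has stopped
        "--args", samtools, "index", bam_path,
    ]


def capture_backtrace(bam_path, samtools="samtools"):
    """Run the command above and return everything gdb printed."""
    result = subprocess.run(
        gdb_backtrace_command(bam_path, samtools),
        capture_output=True,
        text=True,
        check=False,  # a crashing target is the expected case here
    )
    return result.stdout + result.stderr
```

Calling capture_backtrace("some_file.bam") should return text similar to an interactive session, including the backtrace if the run ends in a segfault.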