Getting an error when indexing BAM files
4.9 years ago
Sara ▴ 240

I have aligned my fastq files to hg19 using this command:

bwa mem hg19.fa /DATA/myfile.fastq.gz

then I made bam files from sam files using:

samtools view -Sb myfile.sam > myfile.bam

then I sorted the bam files using:

samtools sort myfile.bam myfile.sorted.bam

and now I am trying to index the sorted bam files using:

samtools index myfile.sorted.bam

but I got this error:

[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.

Do you know what the problem is?

genome alignment • 2.8k views

What you should actually be doing is:

bwa mem hg19.fa /DATA/myfile.fastq.gz | samtools sort -o myfile.sorted.bam

and avoid all manual intermediate steps and unnecessary intermediate files.
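
The one-step approach above can be sketched as a small shell function that also runs the indexing step the question was about. This is a minimal sketch, not a definitive pipeline: the function name align_sort_index and its argument order are made up for illustration, and it assumes a samtools release that accepts `sort -o` and reads from standard input when given `-`.

```shell
# align_sort_index REF FASTQ OUT_BAM   (hypothetical helper name)
# Aligns FASTQ to REF with bwa mem, streams the SAM output straight into
# samtools sort, then indexes the sorted BAM. No intermediate files are kept.
align_sort_index() {
    ref=$1
    fastq=$2
    out=$3
    # The pipeline's exit status is that of samtools sort; a failure inside
    # bwa is only caught if the shell has an option such as pipefail enabled.
    bwa mem "$ref" "$fastq" | samtools sort -o "$out" - &&
        samtools index "$out"
}
```

Called as `align_sort_index hg19.fa /DATA/myfile.fastq.gz myfile.sorted.bam`, this should leave myfile.sorted.bam and its index next to each other.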


Adding -h to samtools view is only needed to keep the header when writing SAM text output; BAM output written with -b always includes the header. Not sure about the truncation.

4.9 years ago

What version of samtools are you using? Newer versions do not accept a bare output file name as a trailing argument; the output has to be given either as -o sort.bam or as > sort.bam.

Anyway, the first thing to check is a quick samtools view sort.bam | head to confirm that your file at least looks sane at the beginning. That error means either you aren't giving it a real BAM, or your file is for some reason truncated. If it looks okay at the beginning, it must have gotten garbled during the sort, so just redo that. But do as Wouster suggested, with no intermediate files. Newer versions of samtools will sort a SAM file and output a BAM.
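
The "EOF marker is absent" part of that error can also be checked by hand. A complete BAM file ends with a fixed 28-byte empty BGZF block, so comparing the last 28 bytes of the file against that block tells you whether the file was cut short. The sketch below uses only tail, od and tr; the function name has_bgzf_eof is made up for illustration. Newer samtools releases ship a quickcheck subcommand that performs this kind of check for you.

```shell
# Hex dump of the 28-byte empty BGZF block that terminates a complete BAM file,
# written in 4-byte chunks so it is easier to proofread.
BGZF_EOF_HEX="1f8b0804""00000000""00ff0600""42430200""1b000300""00000000""00000000"

# has_bgzf_eof FILE   (hypothetical helper name)
# Exits 0 if FILE ends with the BGZF EOF block, non-zero otherwise.
has_bgzf_eof() {
    # Take the last 28 bytes, print them as hex, and strip all whitespace.
    _tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \t\n')
    [ "$_tail_hex" = "$BGZF_EOF_HEX" ]
}
```

For example, `has_bgzf_eof myfile.sorted.bam && echo "EOF marker present" || echo "file looks truncated"`. A present marker does not prove the file is a valid BAM, only that it was not cut off before the final block was written.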
