samtools index command running slow and truncated EOF warnings
4.3 years ago

Hi all,

I am new to FASTA, FASTQ, SAM, BAM and related formats.

I am learning as I go, so I apologize for any lack of substance.

I am working on human reads from my own whole-genome sequencing, downloaded from the service provider's website.

What I have is a:

  • BAM file gz compressed
  • BAI file gz compressed
  • FASTQ R1 file gz compressed
  • FASTQ R2 file gz compressed

To speed things up I decompressed all the files, and that is when I ran into the truncated EOF error in samtools. I don't get any error when I use the *.gz files.

Is there a way to avoid that? I tried to manually force the EOF, but I still get the warning and errors when using samtools view, which is an essential command.

What is puzzling me at the moment, though, is the CPU usage of samtools jobs. If I use the *.gz files, about 25% of each core is used; if I use the uncompressed files, only 2 to 5% of each core is used (I tried the -@ INT flag, and nothing changes).

Is that normal?

As an example, when I run the command:

  • samtools index -@ INT file.bam.gz: about 25% per core
  • samtools index -@ INT file.bam: 2 to 5% per core

Many thanks to all :)

next-gen index samtools eof truncated • 1.6k views

Thank you Pierre,

Thank you for the random access precision, I'll keep it in mind next time I compress a Jay-z flow.

What about the low core usage and the EOF warnings/errors?


For anyone concerned, and for the record:

  • the .gz file is the BAM itself, not a gzip archive; and
  • samtools was graciously doing nothing when fed a SAM, i.e. the decompressed .gz.
4.3 years ago

A BAM file is already compressed in the BGZF format, so you don't need to recompress it with gzip (a plain gzip stream is not valid BGZF and does not support random access).

so

samtools index your.bam

should be enough
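
If you are unsure what a downloaded file really is, you can look at its first bytes. Below is a minimal Python sketch, assuming the BGZF layout described in the SAM/BAM specification (a gzip header with the FEXTRA flag and a "BC" extra subfield, plus a 28-byte empty block as the end-of-file marker). The function names `classify` and `has_bgzf_eof` are my own, not part of samtools, and this is only a quick sanity check, not a full validator.

```python
import struct

# The 28-byte empty BGZF block used as the end-of-file marker (per the SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def classify(path):
    """Return 'bgzf', 'gzip', or 'other' based on the first header in the file."""
    with open(path, "rb") as fh:
        head = fh.read(12)
        if len(head) < 10 or head[:2] != b"\x1f\x8b":
            return "other"  # no gzip magic bytes
        if len(head) < 12 or not head[3] & 0x04:
            return "gzip"  # FEXTRA flag not set, so this cannot be BGZF
        (xlen,) = struct.unpack("<H", head[10:12])
        extra = fh.read(xlen)

    # Walk the extra subfields looking for the BGZF 'B','C' subfield with length 2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (66, 67, 2):
            return "bgzf"
        pos += 4 + slen
    return "gzip"


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker block."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # jump to the end to learn the file size
        if fh.tell() < len(BGZF_EOF):
            return False
        fh.seek(-len(BGZF_EOF), 2)
        return fh.read(len(BGZF_EOF)) == BGZF_EOF
```

Calling `classify("your.bam")` and `has_bgzf_eof("your.bam")` (with your own file name) should tell you whether the file is BGZF and whether its last block is the EOF marker. A file that has been run through gunzip no longer starts with the gzip magic bytes, so it would come back as 'other', which would be consistent with samtools complaining about it.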
