Question: samtools index command running slow and truncated EOF warnings
13 months ago by
vascoambrogi0 wrote:

Hi all,

I am new to FASTA, FASTQ, SAM, BAM and related formats.

I am learning as I go, so I apologize for any lack of substance.

I am working on human reads: my own whole-genome sequencing data, downloaded from the service provider's website.

What I have is:

  • BAM file gz compressed
  • BAI file gz compressed
  • FASTQ R1 file gz compressed
  • FASTQ R2 file gz compressed

To speed things up I decompressed all the files, which led to the truncated EOF error in samtools. I don't get any error when I use the *.gz files.

Is there a way to avoid that? I tried to manually force the EOF, but I still get the warning and errors when using the essential samtools view command.

But what is puzzling me at the moment is the CPU usage of the samtools jobs. With the *.gz files, about 25% of each core is used; with the uncompressed files, only 2 to 5% of each core is used (I tried the -@ INT flag, but nothing changes).

Is that normal?

As an example, when I run the command:

  • samtools index -@ INT file.bam.gz >>>>> 25%
  • samtools index -@ INT file.bam >>>>>> 2-5 %

Many thanks to all :)

modified 13 months ago • written 13 months ago by vascoambrogi0

Thank you Pierre,

Thank you for the clarification about random access; I'll keep it in mind next time I compress a Jay-z flow.

What about the low core usage and the EOF warnings/errors?

written 13 months ago by vascoambrogi0

For anyone concerned, here is the substance:

  • the .gz file is the BAM itself, not a gzip-wrapped BAM; and
  • samtools was graciously doing nothing when fed the SAM obtained by decompressing the .gz.
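
To tell what such a .gz file really holds, one can decompress only its first few bytes and look at them. The Python sketch below is a rough heuristic, not a samtools feature; the function name `sniff_payload` and its return labels are made up for illustration. It relies on two facts: BGZF is readable by ordinary gzip decoders, and an uncompressed BAM stream starts with the magic bytes `BAM\1`, while SAM text normally starts with an `@HD`, `@SQ`, `@RG`, `@PG` or `@CO` header line.

```python
import gzip

# Two-letter record types that can open a SAM header line.
SAM_HEADER_TAGS = {b"@HD", b"@SQ", b"@RG", b"@PG", b"@CO"}


def sniff_payload(path):
    """Guess the payload of a gzip/BGZF file from its first decompressed bytes.

    Returns "BAM", "SAM", "not-gzip" or "unknown" (labels are illustrative).
    """
    try:
        with gzip.open(path, "rb") as handle:
            head = handle.read(4)
    except OSError:
        # gzip raises an OSError subclass when the file is not gzip at all.
        return "not-gzip"
    if head == b"BAM\x01":
        return "BAM"
    if head[:3] in SAM_HEADER_TAGS:
        return "SAM"
    return "unknown"
```

If `sniff_payload("file.bam.gz")` returns "BAM", the file is most likely a regular BAM with an extra .gz extension, and there is nothing to decompress before handing it to samtools.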
written 13 months ago by vascoambrogi0
13 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum133k wrote:

A BAM file is already compressed in the BGZF format, so you don't need to recompress it with gzip (plain gzip output is not BGZF and cannot be read with random access).

so

samtools index your.bam

should be enough
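
One way to check these points before indexing is to look at the raw bytes of the file. The Python sketch below is not part of samtools; the function names `is_bgzf` and `has_bgzf_eof` are made up for illustration. It assumes the BGZF `BC` extra subfield comes first in the gzip header (the usual layout), and it compares the tail of the file with the 28-byte empty block that the SAM/BAM specification defines as the end-of-file marker. A missing marker is the usual reason for the EOF warning in samtools.

```python
import os

# 28-byte empty BGZF block used as the end-of-file marker (from the SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def is_bgzf(path):
    """Return True if the file starts with a BGZF block header.

    Simplification: only checks the gzip magic with the FEXTRA flag set
    and a 'BC' subfield at the start of the extra field.
    """
    with open(path, "rb") as handle:
        header = handle.read(18)
    return header[:4] == b"\x1f\x8b\x08\x04" and header[12:14] == b"BC"


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

If `is_bgzf` is true for the provider's file, it is probably already a BAM that can be indexed directly; if `has_bgzf_eof` is false, the file was likely truncated or rewritten by a tool that does not produce BGZF.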

written 13 months ago by Pierre Lindenbaum133k