Tool: Sambamba: High-Performance Alternative to Samtools and Picard for Indexing, Sorting and Merging BAM Files
7.7 years ago
William ★ 4.9k

Sambamba is a fast, robust tool (and library) for working with BAM files, written in the D programming language. Its current functionality covers an important subset of samtools. Because it makes efficient use of modern multicore CPUs, Sambamba is usually much faster than samtools. For example, here is the time taken to index an 18 GB BAM file on a fast 8-core machine, with Sambamba using all cores at about 45% CPU:

Sambamba index bam:

time ~/sambamba index /scratch/HG00119.mapped.ILLUMINA.bwa.GBR.exome.20111114.bam
real    1m42.930s
user    6m19.964s
sys     0m32.362s


Samtools index bam:

time ~/samtools index /scratch/HG00119.mapped.ILLUMINA.bwa.GBR.exome.20111114.bam
real    5m37.669s
user    5m9.127s
sys     0m13.605s
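
The wall-clock speedup implied by the two `real` timings above can be checked with a short Python sketch. The helper name `to_seconds` is my own and is not part of either tool; the input strings are copied from the `time` output quoted in this post.

```python
import re

def to_seconds(t: str) -> float:
    """Convert a bash `time` value such as '1m42.930s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# `real` (wall-clock) values quoted above for the same BAM file.
sambamba_real = to_seconds("1m42.930s")
samtools_real = to_seconds("5m37.669s")

# Ratio of wall-clock times: how many times faster sambamba finished.
speedup = samtools_real / sambamba_real
print(f"wall-clock speedup: {speedup:.2f}x")  # 3.28x
```

This only compares a single run of each tool, so it is an indication rather than a benchmark.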


https://github.com/lomereiter/sambamba


How many threads were used for the sambamba time?


I would also like to know how many concurrent threads were used. Assuming only the userspace code was multithreaded, we can compute user / (real - sys), which is approximately 5. If the ~45% utilization figure is correct, then 5 * 1.55 = 7.75, so approximately 8 threads.
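
The estimate of roughly 5 can be reproduced from the sambamba timings in the post. This is a rough sketch, not a measurement: it takes the ratio of user time to (real - sys), under the same assumption that only userspace code ran in parallel, and it also prints the more common (user + sys) / real figure for comparison. The variable names and the 8-core count are taken from, or chosen to match, the original post.

```python
# Sambamba timings from the post, converted to seconds by hand.
real = 1 * 60 + 42.930   # wall-clock time
user = 6 * 60 + 19.964   # CPU time in user space, summed over all threads
sys_ = 0 * 60 + 32.362   # CPU time in the kernel
cores = 8                # machine size stated in the post

# Estimate used in this comment: treat sys time as serial,
# so the parallel part of the run lasted (real - sys) seconds.
user_parallelism = user / (real - sys_)

# Conventional estimate: total CPU time divided by wall-clock time.
avg_busy_cores = (user + sys_) / real
utilization = avg_busy_cores / cores

print(f"user / (real - sys)  = {user_parallelism:.2f}")  # 5.38
print(f"(user + sys) / real  = {avg_busy_cores:.2f}")    # 4.01
print(f"average utilization  = {utilization:.0%}")       # 50%
```

Neither figure gives the actual thread count; they only show how many cores were busy on average during the run.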


Completed the quote with the thread info.


Isn't disk IO the main bottleneck in this operation?


I guess that depends on the storage setup. The faster the storage, the greater the speedup (see the indexing results): https://github.com/lomereiter/sambamba/wiki/Comparison-with-samtools


How do I install and use it correctly? This is what I get:

user@user-MS-7817:~/Documents/sambamba/sambamba$ make sambamba-ldmd2-64
make: *** No rule to make target `sambamba-ldmd2-64'. Stop.
user@user-MS-7817:~/Documents/sambamba/sambamba$