If you use the most recent samtools version from the GitHub repository, you can give the memory limit with a unit suffix and sort with multiple threads, which makes a massive difference!
https://github.com/samtools/samtools
$ samtools sort
Usage: samtools sort [options] <in.bam> <out.prefix>
Options: -n        sort by read name
         -f        use <out.prefix> as full file name instead of prefix
         -o        final output to stdout
         -l INT    compression level, from 0 to 9 [-1]
         -@ INT    number of sorting and compression threads [1]
         -m INT    max memory per thread; suffix K/M/G recognized [768M]
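Because the usage text describes -m as a per-thread limit, it can help to estimate the combined buffer size before picking values. A minimal sketch, assuming hypothetical values for the thread count and placeholder file names (in.bam, out.prefix); the command is only printed, not run:

```shell
# Hypothetical values, chosen only for illustration.
threads=4
mem_per_thread_mb=768   # same as the default [768M] shown in the usage text

# -m is documented as "per thread", so the sort buffers can add up to
# roughly threads * mem_per_thread (actual process memory may be higher).
total_mb=$(( threads * mem_per_thread_mb ))
echo "approximate total sort buffer: ${total_mb}M"

# Print the command instead of executing it; in.bam and out.prefix are placeholders.
cmd="samtools sort -@ $threads -m ${mem_per_thread_mb}M in.bam out.prefix"
echo "$cmd"
```

If the estimate is close to the machine's RAM, lowering either -@ or -m is the obvious adjustment.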
Does the samtools sort maximum-memory parameter (-m) require or accept values given in bytes, kB, MB or GB?
Short answer: no. In this version the value is read as a plain integer number of bytes, with no unit suffix:
$ grep "max_mem" bam_sort.c
size_t max_mem = 500000000;
case 'm': max_mem = atol(optarg); break;
But you can always use bash inline arithmetic to compute the value:
samtools sort -m $((10*1024)) ....
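Note that $((10*1024)) evaluates to 10240, which this older version would read as bytes, so a realistic limit needs more factors of 1024. A small sketch of the same trick for 2 GiB, assuming a shell with 64-bit arithmetic and placeholder file names; the command is printed rather than executed:

```shell
# Older build: -m takes a plain byte count, so compute it with shell arithmetic.
mem_bytes=$(( 2 * 1024 * 1024 * 1024 ))   # 2 GiB expressed in bytes
echo "$mem_bytes"

# in.bam and out.prefix are placeholder names for illustration only.
cmd="samtools sort -m $mem_bytes in.bam out.prefix"
echo "$cmd"
```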
Ah, you're right! The version I tested at home was the 'old' 0.1.18.
Nice! I'll certainly try this. What effect does the compression level have? Isn't SAM format uncompressed? Does it mean compression to keep the memory use down?
The records stored in memory are uncompressed.
What is the option -l for then?
It is used when samtools is writing the data (not in memory): https://github.com/samtools/samtools/blob/develop/bam_sort.c#L500
What is the advantage or disadvantage of using it, if the data is not compressed in memory and the output data is also not compressed?
Uncompressed in memory: the sorting algorithm runs faster because it doesn't need to compress and uncompress each item. Compressed output: less storage needed. Uncompressed output: faster if a downstream program reads the data in a pipeline.
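The two output cases could look like this. A rough sketch using only the options from the usage text above (-l for the level, -o for stdout); the file names and downstream_tool are hypothetical placeholders, and the commands are printed rather than run:

```shell
# Stored result: highest compression level to save disk space.
store_cmd="samtools sort -l 9 in.bam sorted"

# Piped result: level 0 with -o (stdout), so the next program in the
# pipeline spends less time on decompression. tmp_prefix and
# downstream_tool are placeholders, not real names.
pipe_cmd="samtools sort -l 0 -o in.bam tmp_prefix | downstream_tool"

echo "$store_cmd"
echo "$pipe_cmd"
```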