bam_sort_core problem when BAM files are processed by samtools
5.5 years ago
1106518271 ▴ 60

I sorted my RNA-Seq BAM files after mapping with STAR. Here is the key command from the for loop in my bash script:

samtools sort $file -o ${filename}.sorted.bam

Afterwards, the error log file contains lines like these:

[bam_sort_core] merging from 6 files and 1 in-memory blocks...
[bam_sort_core] merging from 4 files and 1 in-memory blocks...
[bam_sort_core] merging from 8 files and 1 in-memory blocks...
[bam_sort_core] merging from 9 files and 1 in-memory blocks...
[bam_sort_core] merging from 7 files and 1 in-memory blocks...
[bam_sort_core] merging from 4 files and 1 in-memory blocks...
[bam_sort_core] merging from 5 files and 1 in-memory blocks...
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
[bam_sort_core] merging from 13 files and 1 in-memory blocks...
[bam_sort_core] merging from 8 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
...

Is something wrong here? I searched but still don't know why this happens, so I'm not sure whether these sorted files can be used in the next step.
What's more, I have about 63 files, but the log contains only a few of these lines (about 6). Does that mean not every file runs into this problem? That confuses me even more.

Any ideas will be appreciated!!!

rna-seq samtools • 4.8k views

If you have the memory, you can reduce the number of temporary files by raising the per-thread memory limit from the default of 768M to, say, 2G with the -m option, e.g. samtools sort -m 2G -o out.bam in.bam. Never write something like -m 2 instead of -m 2G: without a suffix the value is read as bytes, so this would set the memory limit to 2 bytes, producing thousands of temporary files and eventually crashing the system.
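
To guard against the missing-suffix mistake described above, a small wrapper can reject a bare number before samtools starts. This is only a sketch: valid_mem and safe_sort are hypothetical helper names, not part of samtools, and the check simply requires digits followed by one K, M, or G suffix.

```shell
# Hypothetical helper: succeed only if the value is digits plus one K/M/G suffix.
valid_mem() {
    vm_num=${1%[KMGkmg]}                 # strip a single trailing suffix letter
    [ "$vm_num" != "$1" ] || return 1    # nothing stripped: the suffix is missing
    case "$vm_num" in
        ''|*[!0-9]*) return 1 ;;         # empty or non-numeric remainder
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: refuse to run the sort when -m has no suffix.
# Arguments: memory limit, input BAM, output BAM.
safe_sort() {
    if ! valid_mem "$1"; then
        echo "refusing -m '$1': add a K, M, or G suffix (e.g. 2G)" >&2
        return 1
    fi
    samtools sort -m "$1" -o "$3" "$2"
}
```

With these definitions, safe_sort 2G in.bam out.bam runs the sort as usual, while safe_sort 2 in.bam out.bam stops with a message instead of launching samtools.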


Got it, many thanks!


So is it really an error? Also, does anyone know how to set -@ and -m, and which of the two is more useful when processing hundreds of BAM files at the same time on a cluster? Thanks a lot.


Please use Add Comment for comments. As Istvan explained, these are just status messages, neither errors nor warnings. Set -@ and -m as you like, but these are options that still deal with one file at a time. If you want things parallelized, have a look at GNU parallel, like:

find ./ -maxdepth 1 -name "*.bam" | parallel -j 8 "samtools sort -@ 2 -m 2G -o {.}_sorted.bam {}"

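This command will sort all BAM files in the current directory, 8 at a time, with each job using 2 threads and 2 GB of memory per thread. In the worst case that is about 8 x 2 x 2 GB = 32 GB in total, so adjust -j, -@ and -m to what your node can provide.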
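
The memory budget of the parallel command above is roughly jobs x threads x per-thread memory. The sketch below does that arithmetic in shell; the function names and the 64 GB node size are made up for illustration, and since the -m limit is only approximate, it is wise to leave some headroom.

```shell
# Worst-case memory in GB: parallel jobs * threads per job (-@) * GB per thread (-m).
total_sort_mem_gb() {
    echo $(( $1 * $2 * $3 ))
}

# Largest -j that fits a node: node memory in GB / (threads per job * GB per thread).
max_jobs() {
    echo $(( $1 / ($2 * $3) ))
}

total_sort_mem_gb 8 2 2   # prints 32
max_jobs 64 2 2           # prints 16 (assumed 64 GB node)
```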


By the way, will this "error" lead to a wrong result in ${filename}.sorted.bam?

5.5 years ago

These are not error messages, just debugging notes.

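Files too large to sort within the memory limit are sorted in chunks, each written to a temporary file, and the chunks are then merged into the final output. Once the sort completes, the temporary files are removed. Files small enough to fit in memory skip this step, which is why only some of your files produce the message.

There is nothing to be concerned about.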
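
If you would still like to confirm that an output file is sorted before moving on, one option is to read the SO tag from the @HD header line. The helper below is a sketch, with sort_order as a hypothetical name; it only parses header text from standard input, so it relies on the header having been written correctly.

```shell
# Hypothetical helper: print the SO (sort order) value from SAM header text on stdin.
sort_order() {
    awk -F'\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) { print substr($i, 4); exit }
    }'
}

# Usage (needs samtools; sample.sorted.bam is a placeholder file name):
#   samtools view -H sample.sorted.bam | sort_order
#   samtools quickcheck sample.sorted.bam && echo "file looks complete"
```

For a file produced by a default samtools sort, the first command should report coordinate. samtools quickcheck is a separate, quick test for truncated files.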
