I've used samtools mpileup and bcftools call on a few occasions to call variants from multiple BAM files, and I've never had issues with the process until now. To speed up genome-wide variant calling in 25 samples (25 BAM files), I am splitting the calling into 26 regions. I keep getting a memory leak error. So far, the jobs reporting the error have completed and generated a VCF file, but they consume far more memory than expected. Why is this happening? Are the resulting VCF files trustworthy?
My command is this:
samtools mpileup -ug -l refctgs.region9.bed -f WB_2.0.fa -b bam_files.txt | bcftools call -vmO z -o bbub.refctgs.region9.vcf.gz
And the error:
[mpileup] 25 samples in 25 input files
<mpileup> Set max per-file depth to 320
[bam_plp_destroy] memory leak: 12. Continue anyway.
[bam_plp_destroy] memory leak: 31. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 21. Continue anyway.
[bam_plp_destroy] memory leak: 17. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 36. Continue anyway.
[bam_plp_destroy] memory leak: 16. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 28. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 31. Continue anyway.
[bam_plp_destroy] memory leak: 24. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 15. Continue anyway.
[bam_plp_destroy] memory leak: 26. Continue anyway.
[bam_plp_destroy] memory leak: 14. Continue anyway.
[bam_plp_destroy] memory leak: 21. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 11. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 9. Continue anyway.
Thanks for the answer and reference. And yes, there are a lot of regions in my BED file because my reference genome is in about 300,000 pieces! Time to update to 1.3.
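For anyone doing a similar region split: a minimal sketch of dividing a large contig BED into per-region files, one per mpileup job. The toy BED contents, file names, and the two-way split are made up for illustration; a real run would use the full refctgs.bed and 26 regions.

```shell
# Toy stand-in for the real contig BED (tab-separated chrom, start, end):
printf 'ctg1\t0\t5000\nctg2\t0\t4200\nctg3\t0\t3100\nctg4\t0\t2800\n' > refctgs.bed

nregions=2   # would be 26 in the real run; 2 here so the toy input splits evenly
total=$(wc -l < refctgs.bed)
per=$(( (total + nregions - 1) / nregions ))   # ceiling division

# Produce refctgs.region.00, refctgs.region.01, ... (GNU split numeric suffixes);
# each file then becomes the -l argument of one mpileup job.
split -l "$per" -d refctgs.bed refctgs.region.
ls refctgs.region.*
```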
Alas, that was a bit ambiguous. The leak is still present in 1.3, as we noticed it rather late in the piece and didn't want to destabilise things. The fix was made soon after the 1.3 release, so the leak is fixed on the current develop branch of the htslib repository and will be fixed in the future 1.4 release.
But time to update to 1.3 anyway :-)
I got the same message, and in my case the output file was wrong, though whether to blame samtools mpileup or me isn't clear ;-). I did a pileup of 2 BAM files: one for a parent strain aligned to the standard reference (Plasmodium falciparum), and one for a manipulated strain with a deletion and insertion designed to knock out a gene. The manipulated strain had been aligned to a genome consisting of the reference plus a snippet of the hg19 reference corresponding to the expected insert. Because the lists of chromosomes in the 2 references didn't agree, I got 2 "[bam_plp_destroy] memory leak: ... Continue anyway." errors AND my pileup file only had 2 chromosomes, instead of 16 (14 + mito + apico for P. falciparum) or 17 (+ hg19 snippet). So it is not necessarily true that if the job didn't abort the VCF files are unaffected.
If the lists of chromosomes in the headers of the two BAM files differ, probably you are running into issue 306. To be sure, I would need to see the headers and what chromosomes appeared in your output file.
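A quick way to compare the sequence lists is to extract the @SQ lines from each header and diff them. The @SQ lines below are hypothetical stand-ins written to plain files; with real data they would come from `samtools view -H file.bam`, and the file names here are made up.

```shell
# Hypothetical header excerpts (tab-separated @SQ lines, as samtools view -H
# would print them). Real ones: samtools view -H parent.bam > parent.hdr
printf '@SQ\tSN:Pf3D7_01_v3\tLN:640851\n@SQ\tSN:Pf3D7_02_v3\tLN:947102\n' > parent.hdr
printf '@SQ\tSN:Pf3D7_01_v3\tLN:640851\n@SQ\tSN:Pf3D7_02_v3\tLN:947102\n@SQ\tSN:hg19_insert\tLN:2100\n' > knockout.hdr

# Pull out just the sequence names (second tab-separated field, SN:...):
grep '^@SQ' parent.hdr   | cut -f2 | sort > parent.chroms
grep '^@SQ' knockout.hdr | cut -f2 | sort > knockout.chroms

# comm -3 prints only the names present in one header but not the other:
comm -3 parent.chroms knockout.chroms
```

Any output from the final command means the two BAMs were aligned against different sequence dictionaries, which is the situation described in issue 306.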
However, this is a separate issue from the memory leak and warning message, which themselves remain benign.