Question: samtools mpileup memory leakage
lkw222 wrote (3.9 years ago):

I've used samtools mpileup and bcftools call on a few occasions to call variants from multiple BAM files. I've never had issues with the process until now. I am calling variants from 26 regions to try and speed up the genome wide variant calling in 25 samples (25 BAM files). I keep getting a memory leak error. So far, the jobs reporting the memory leak error have completed and generated a VCF file, but they consume way more memory than expected. Why is this happening? Are the resulting VCF files trustworthy?

My command is this:

samtools mpileup -ug -l refctgs.region9.bed -f WB_2.0.fa -b bam_files.txt | bcftools call -vmO z -o bbub.refctgs.region9.vcf.gz

And the error:

[mpileup] 25 samples in 25 input files

<mpileup> Set max per-file depth to 320

[bam_plp_destroy] memory leak: 12. Continue anyway.

[bam_plp_destroy] memory leak: 31. Continue anyway.

[bam_plp_destroy] memory leak: 18. Continue anyway.

[bam_plp_destroy] memory leak: 21. Continue anyway.

[bam_plp_destroy] memory leak: 17. Continue anyway.

[bam_plp_destroy] memory leak: 18. Continue anyway.

[bam_plp_destroy] memory leak: 22. Continue anyway.

[bam_plp_destroy] memory leak: 20. Continue anyway.

[bam_plp_destroy] memory leak: 36. Continue anyway.

[bam_plp_destroy] memory leak: 16. Continue anyway.

[bam_plp_destroy] memory leak: 22. Continue anyway.

[bam_plp_destroy] memory leak: 28. Continue anyway.

[bam_plp_destroy] memory leak: 20. Continue anyway.

[bam_plp_destroy] memory leak: 31. Continue anyway.

[bam_plp_destroy] memory leak: 24. Continue anyway.

[bam_plp_destroy] memory leak: 22. Continue anyway.

[bam_plp_destroy] memory leak: 15. Continue anyway.

[bam_plp_destroy] memory leak: 26. Continue anyway.

[bam_plp_destroy] memory leak: 14. Continue anyway.

[bam_plp_destroy] memory leak: 21. Continue anyway.

[bam_plp_destroy] memory leak: 20. Continue anyway.

[bam_plp_destroy] memory leak: 20. Continue anyway.

[bam_plp_destroy] memory leak: 11. Continue anyway.

[bam_plp_destroy] memory leak: 18. Continue anyway.

[bam_plp_destroy] memory leak: 9. Continue anyway.

John Marshall (Glasgow, Scotland) wrote (3.9 years ago):

Your VCF files are entirely unaffected (as long as the leak doesn't cause the job to abort, which you said isn't happening).  This is just a warning about a memory leak, not an error.

This is a longstanding minor mpileup bug that occurs when the code breaks out of a pileup loop before its natural end (i.e., the end of a chromosome); in this case I guess that is due to the regions you've specified with the -l option.  For us it has usually caused only a trivial memory leak, but I guess there are a lot of regions in your BED file.

It was fixed after the 1.3 release.  See issue #299.
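Since the warning itself is benign, one way to gain some confidence in a run is to check that nothing else was written to stderr. The following is a minimal sketch, not part of samtools: it assumes the stderr of mpileup was captured as text (for example with a shell redirect to a log file of your choosing) and that the warning has exactly the wording shown in the question. The function name `split_stderr` is hypothetical.

```python
import re

# Wording copied from the log quoted in the question; other samtools
# versions may phrase the warning differently (assumption).
LEAK_RE = re.compile(r"^\[bam_plp_destroy\] memory leak: (\d+)\. Continue anyway\.$")


def split_stderr(lines):
    """Separate leak warnings from every other non-empty stderr line.

    Returns (leak_numbers, other_lines), where leak_numbers holds the
    integer reported in each warning.
    """
    leak_numbers, other_lines = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # the log in the question is double-spaced
        match = LEAK_RE.match(line)
        if match:
            leak_numbers.append(int(match.group(1)))
        else:
            other_lines.append(line)
    return leak_numbers, other_lines


# Small excerpt of the log from the question.
sample = """[mpileup] 25 samples in 25 input files

<mpileup> Set max per-file depth to 320

[bam_plp_destroy] memory leak: 12. Continue anyway.

[bam_plp_destroy] memory leak: 31. Continue anyway.
"""
leaks, other = split_stderr(sample.splitlines())
print(len(leaks), "leak warnings:", leaks)
for line in other:
    print("other message:", line)
```

Any line reported as "other message" beyond the two informational mpileup lines would deserve a closer look before trusting the output.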


Thanks for the answer and reference. And yes, there are a lot of regions in my BED file because my reference genome is in about 300,000 pieces! Time to update to 1.3.

Reply written 3.9 years ago by lkw222

Alas, that was a bit ambiguous.  The leak is still present in 1.3, as we noticed it rather late in the piece and didn't want to destabilise things.  The fix was made soon after the 1.3 release, so the leak is fixed on the current develop branch of the (htslib) repository and will be fixed in the future 1.4 release.

But time to update to 1.3 anyway :-)
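To tell whether an installed samtools should already contain the fix, the version string can be compared against 1.4. This is only a sketch under assumptions: that the fix shipped in 1.4 as anticipated above, and that the first line of the `samtools --version` output reads `samtools <version>`. The helper names are hypothetical.

```python
import re


def parse_samtools_version(version_output):
    """Return the numeric version as a tuple, e.g. (1, 3) or (1, 3, 1)."""
    first_line = version_output.splitlines()[0]
    match = re.match(r"samtools (\d+(?:\.\d+)*)", first_line)
    if match is None:
        raise ValueError(f"unrecognised version line: {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def has_anticipated_fix(version_output, fixed_in=(1, 4)):
    """True if the reported release is at least `fixed_in`.

    Assumption: the fix is present from 1.4 onwards. A development
    build made after 1.3 may contain the fix yet still parse as (1, 3),
    so a False result is not proof that the leak is present.
    """
    return parse_samtools_version(version_output) >= fixed_in


# Illustrative version strings, not captured from a real installation.
print(has_anticipated_fix("samtools 1.3\nUsing htslib 1.3\n"))
print(has_anticipated_fix("samtools 1.4\nUsing htslib 1.4\n"))
```

In practice the text passed in would come from running `samtools --version`, for example through `subprocess.run(["samtools", "--version"], capture_output=True, text=True).stdout`.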

Reply written 3.9 years ago (modified 3.9 years ago) by John Marshall

I got the same message, and in my case the output file was wrong, though whether to blame samtools mpileup or me isn't clear ;-). I did a pileup of 2 BAM files: one for a parent strain aligned to the standard reference (Plasmodium falciparum), and one for a manipulated strain with a deletion and insertion designed to knock out a gene. The manipulated strain had been aligned to a genome consisting of the reference plus a snippet of the hg19 reference corresponding to the expected insert. Because the lists of chromosomes in the 2 references didn't agree, I got 2 "[bam_plp_destroy] memory leak: ... Continue anyway." errors AND my pileup file only had 2 chromosomes, instead of 16 (14 + mito + apico for P. falciparum) or 17 (+ hg19 snippet). So it is not necessarily true that if the job didn't abort, the VCF files are unaffected.

Reply written 3.8 years ago by penington.j

If the lists of chromosomes in the headers of the two BAM files differ, you are probably running into issue 306. To be sure, I would need to see the headers and what chromosomes appeared in your output file.

However, this is a separate issue from this memory leak and warning message, which themselves remain benign.
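For anyone wanting to gather that information themselves, here is a rough sketch that compares the @SQ reference names of two headers and lists the chromosomes present in a text pileup or VCF. It assumes the header text was obtained with `samtools view -H` and that the output file is tab-delimited with the chromosome in the first column; the function names and the sample reference names below are made up for illustration.

```python
def sq_names(header_text):
    """Reference names (SN tags of @SQ lines), in header order."""
    names = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def compare_headers(header_a, header_b):
    """Report names unique to either header and whether the order matches."""
    a, b = sq_names(header_a), sq_names(header_b)
    return {
        "only_in_a": [name for name in a if name not in b],
        "only_in_b": [name for name in b if name not in a],
        "same_order": a == b,
    }


def output_chromosomes(lines):
    """Distinct first-column values, in order of first appearance.

    Lines starting with '#' (VCF header lines) are skipped.
    """
    seen = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom = line.split("\t", 1)[0]
        if chrom not in seen:
            seen.append(chrom)
    return seen


# Hypothetical headers: the second one has an extra inserted sequence.
parent = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chrA\tLN:1000\n@SQ\tSN:chrB\tLN:2000\n"
modified = parent + "@SQ\tSN:insert_snippet\tLN:300\n"
report = compare_headers(parent, modified)
print(report)

# Hypothetical output lines.
chroms = output_chromosomes(["#CHROM\tPOS\n", "chrA\t10\tA\n", "chrA\t11\tC\n", "chrB\t5\tG\n"])
print(chroms)
```

A non-empty `only_in_a` or `only_in_b`, or `same_order` being false, would indicate the kind of header mismatch described above; it says nothing about the memory leak warning.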

Reply written 3.8 years ago by John Marshall