samtools mpileup memory leakage
8.3 years ago
lkw222 ▴ 30

I've used samtools mpileup and bcftools call on a few occasions to call variants from multiple BAM files, and I never had issues with the process until now. I am calling variants in 26 separate regions to speed up genome-wide variant calling in 25 samples (25 BAM files). I keep getting a memory leak error. So far, the jobs reporting the error have completed and generated a VCF file, but they consume far more memory than expected. Why is this happening? Are the resulting VCF files trustworthy?

My command is this:

samtools mpileup -ug -l refctgs.region9.bed -f WB_2.0.fa -b bam_files.txt | bcftools call -vmO z -o bbub.refctgs.region9.vcf.gz

And the error:

[mpileup] 25 samples in 25 input files
<mpileup> Set max per-file depth to 320
[bam_plp_destroy] memory leak: 12. Continue anyway.
[bam_plp_destroy] memory leak: 31. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 21. Continue anyway.
[bam_plp_destroy] memory leak: 17. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 36. Continue anyway.
[bam_plp_destroy] memory leak: 16. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 28. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 31. Continue anyway.
[bam_plp_destroy] memory leak: 24. Continue anyway.
[bam_plp_destroy] memory leak: 22. Continue anyway.
[bam_plp_destroy] memory leak: 15. Continue anyway.
[bam_plp_destroy] memory leak: 26. Continue anyway.
[bam_plp_destroy] memory leak: 14. Continue anyway.
[bam_plp_destroy] memory leak: 21. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 20. Continue anyway.
[bam_plp_destroy] memory leak: 11. Continue anyway.
[bam_plp_destroy] memory leak: 18. Continue anyway.
[bam_plp_destroy] memory leak: 9. Continue anyway.


Your VCF files are entirely unaffected (as long as the leak doesn't cause the job to abort, which you said isn't happening). This is just a warning about a memory leak, not an error.

This is a longstanding minor mpileup bug. It occurs when the code breaks out of a pileup loop before its natural end (i.e., the end of a chromosome), which in this case is presumably caused by the regions you've specified with the -l option. For us it has usually caused only a trivial memory leak, but I guess there are a lot of regions in your BED file.

It's been fixed after the 1.3 release. See issue #299.
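
For anyone who wants to check that these warnings are the only thing written to stderr, here is a minimal sketch. It assumes stderr from mpileup was redirected to a log file; the file name and the two helper functions are hypothetical and not part of samtools.

```shell
# Assumption: stderr of mpileup was captured in a log file, e.g.
#   samtools mpileup ... 2> mpileup.region9.log | bcftools call ...
# (the log file name is made up for this example)

# Count the [bam_plp_destroy] warnings in a log file.
count_leak_warnings() {
    grep -F -c '[bam_plp_destroy] memory leak' "$1" || true
}

# Print every non-empty log line that is NOT one of those warnings,
# so that genuine errors are not lost among them.
other_messages() {
    grep -F -v '[bam_plp_destroy] memory leak' "$1" | grep -v '^[[:space:]]*$' || true
}
```

Running `other_messages mpileup.region9.log` on a log like the one in the question should leave only the two informational mpileup lines.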


Thanks for the answer and reference. And yes, there are a lot of regions in my BED file because my reference genome is in about 300,000 pieces! Time to update to 1.3.


Alas, that was a bit ambiguous. The leak is still present in 1.3, as we noticed it rather late in the piece and didn't want to destabilise things. The fix was made soon after the 1.3 release, so the leak is fixed in current develop in the (htslib) repository and will be fixed in the future 1.4 release.

But time to update to 1.3 anyway :-)
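
A small sketch for checking whether an installed samtools predates the release that, going by the reply above, should contain the fix (1.4). The function name is hypothetical, and it only looks at the major and minor version numbers.

```shell
# Returns success (exit status 0) if the given version string, e.g. "1.3.1",
# is older than 1.4; returns 1 otherwise. Only major.minor are compared.
older_than_1_4() {
    echo "$1" | awk -F. '{ exit !($1 < 1 || ($1 == 1 && $2 < 4)) }'
}

# Possible usage (assumes a samtools that supports --version):
#   ver=$(samtools --version | head -n 1 | awk '{ print $2 }')
#   if older_than_1_4 "$ver"; then echo "samtools $ver may still show the leak"; fi
```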


I got the same message, and in my case the output file was wrong, though whether to blame samtools mpileup or me isn't clear ;-). I did a pileup of 2 BAM files: one for a parent strain aligned to the standard reference (Plasmodium falciparum), and one for a manipulated strain with a deletion and insertion designed to knock out a gene. The manipulated strain had been aligned to a genome consisting of the reference plus a snippet of the hg19 reference corresponding to the expected insert. Because the lists of chromosomes in the 2 references didn't agree, I got 2 "[bam_plp_destroy] memory leak: ... Continue anyway." errors AND my pileup file only had 2 chromosomes, instead of 16 (14 + mito + apico for P. falciparum) or 17 (+ hg19 snippet). So it is not necessarily true that if the job didn't abort the VCF files are unaffected.


If the lists of chromosomes in the headers of the two BAM files differ, probably you are running into issue 306. To be sure, I would need to see the headers and what chromosomes appeared in your output file.

However this is a separate issue to this memory leak and warning message, which themselves remain benign.
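
One way to compare the chromosome lists of two BAM headers, as a rough sketch: the helper `sq_names` and the file names are hypothetical, and it assumes standard tab-separated @SQ header lines with an SN: field.

```shell
# Read a SAM header on stdin and print the sequence names (SN: field)
# of its @SQ lines, one per line, sorted.
sq_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort
}

# Possible usage (BAM file names are placeholders):
#   samtools view -H parent.bam | sq_names > parent.names
#   samtools view -H mutant.bam | sq_names > mutant.names
#   comm -3 parent.names mutant.names   # names present in only one header
```

If `comm -3` prints anything, the two headers do not list the same sequences. Note that this compares names only, not their order or lengths.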
