I am trying to use the bedtools window command to count the number of variants in each window of the hg19 human VCF files. Here is the command:
bedtools window -a 50kb.bed -b chr1.vcf.gz -c > coverage.txt
This results in the following error:
However, the command works fine on some of the smaller chromosomes (e.g. chr19) without the error occurring. What is causing this error and how can I stop it from happening?
This is an issue related to RAM or, more likely, available disk space. Instead of letting the whole system crash, the operating system kills the process with signal 9 (SIGKILL), and the shell reports it.
On which OS are you running this? If Linux/UNIX, is it being run on a shared system?
I am running this through the Linux terminal on a Mac. Is this an issue with bedtools then? I have worked on these same large files with other tools (bcftools, vcftools, etc.) with no issues. Does bedtools unzip the file before working on it?
Yes, I assume that it unpacks it into RAM and then performs the operation. As your chr1 is 30GB unpacked, though, you would need more than 30GB of RAM. It may actually work if you unpack it to the hard disk first and then re-run the bedtools command on the uncompressed file.
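To check how large the file really is once unpacked before trying this, something along these lines should work. It is only a sketch: the helper name uncompressed_bytes is made up for illustration, and it streams the file through gzip rather than writing anything to disk.

```shell
# Hypothetical helper (not part of bedtools): print the uncompressed
# size, in bytes, of a gzipped file without writing it to disk.
uncompressed_bytes() {
    # gzip -dc streams the decompressed data to stdout; wc -c counts bytes.
    # tr strips the padding spaces that some wc implementations (e.g. macOS) add.
    gzip -dc "$1" | wc -c | tr -d ' '
}

# Example usage with the file names from the question:
#   uncompressed_bytes chr1.vcf.gz
#   gzip -dc chr1.vcf.gz > chr1.vcf      # unpack to disk if there is room
#   bedtools window -a 50kb.bed -b chr1.vcf -c > coverage.txt
```

If the reported size is larger than both your RAM and your free disk space, unpacking first is unlikely to help.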
There are probably other fancy ways of doing this to avoid excessive memory usage.
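One such way, as a rough sketch rather than a drop-in replacement for bedtools window: stream the compressed VCF through awk and count records per fixed-size window, so only one counter is held in memory at a time. This assumes the VCF is sorted by chromosome and position, that the windows are non-overlapping 50 kb bins starting at 0 (as 50kb.bed presumably is), and that windows containing no variants can be left out of the output. The function name count_variants_per_window is made up for illustration. Note that bedtools window also extends each interval by a flanking distance, which this sketch does not do.

```shell
# Hypothetical helper: count variants per fixed-size window from a
# position-sorted, gzipped VCF, streaming so memory use stays small.
# Usage: count_variants_per_window FILE.vcf.gz [WINDOW_SIZE]
count_variants_per_window() {
    vcf_gz="$1"
    window_size="${2:-50000}"
    gzip -dc "$vcf_gz" | awk -F'\t' -v size="$window_size" '
        BEGIN { OFS = "\t" }
        /^#/ { next }    # skip VCF header lines
        {
            # VCF POS is 1-based; convert to a 0-based, BED-style window start.
            start = int(($2 - 1) / size) * size
            # When the chromosome or window changes, emit the finished window.
            if (seen && ($1 != chrom || start != cur)) {
                print chrom, cur, cur + size, count
                count = 0
            }
            chrom = $1; cur = start; count++; seen = 1
        }
        END { if (seen) print chrom, cur, cur + size, count }
    '
}

# Example usage with the file name from the question:
#   count_variants_per_window chr1.vcf.gz 50000 > coverage.txt
```

The output has four tab-separated columns (chromosome, window start, window end, variant count), similar in layout to what the -c option produces, but only for windows that contain at least one variant.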