I want to calculate the number of reads mapping to regions of the genome, and I want to group these reads into bins of say 1kb. I have used bedtools genomecov to generate a bedgraph file that shows the coverage of reads across the genome.
bedtools genomecov -bg -ibam file.bam >Coverage.bedgraph
I then merged nearby regions of coverage using:
bedtools merge -d 1000 -c 4 -o sum -i Coverage.bedgraph > Coverage.merged.bedgraph
The problem is that the "sum" part adds the coverage from the different regions, so when merged, reads are counted more than once.
Is there a way to merge the regions without counting reads more than once? I think the necessary information from the bam file is probably lost at this point. Is there a way to do this with an option in bedtools genomecov i.e. can you make it report regions of coverage with a specified 'bin' size?