Hi, currently, I am trying to calculate the coverage for my genes of interest (i.e. number of reads falling in or overlapping by the gene of interest) from DNA-seq data in a BAM file. Genes of interest are annotated in a BED file.
There are several answers on Biostar suggesting to use
bedtools coverage e.g.,
bedtools coverage -abam sample.bam -b exons.bed -counts. However, while checking the Samtools' manual, I stumbled upon
samtools bedcov with the following description:
read depth per BED region. Reports read depth per genomic region, as specified in the supplied BED file.
which can be used like this:
samtools bedcov gene.bed sample.bam.
I have tested these two and I am getting different results! For instance, for
OR4F5gene I get the following results (using the same BAM file for both):
chr1 69091 70008 "OR4F5" 61 #from bedtools coverage
chr1 69091 70008 "OR4F5" 4714 #from samtools bedcov
61 is very different than
Another observation is that the result from
Samtools is instantaneous for the above test while
Bedtools takes a lot of time to produce the result. The speed itself has led me to think of using
samtools bedcov. However, I was wondering whether there is catch that I am overlooking! I tried to find more information about
samtools bedcov but I was not able to find anything more than that one-liner description.
I would be happy to hear about your feedback on this.
[EDIT]: This is how it looks like in IGV. It seems that
Bedtools results is closer to reality.