Question: hs37d5 samtools idxstats
Hello, I am using samtools idxstats and I'd like to know what hs37d5 is. Is it only the decoy sequences, or the sum of the reference genome + decoy sequences?

I mean, I have different BAM files and I want a measure of the "coverage" of each one. I thought about calculating the percentage of mapped reads (= #mapped reads ÷ sequence length) from the samtools idxstats output. Can I use that, or is it not representative, and should I instead use the #mapped reads from chr1-chrY?


Quite a few resources from reputable individuals (including our very own Devon) and companies on this:

That should help to explain hs37d5.


For 'coverage' metrics, you appear to be referring to depth of coverage, which would be better served by BEDTools coverage or BEDTools genomecov. SAMtools idxstats gives a different type of information, i.e., the number of reads aligning to each of your contigs, and is more often used for gauging overall alignment metrics.
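To make the idxstats part concrete, here is a minimal Python sketch of the per-contig read counts it reports. It assumes the usual four tab-separated columns (contig name, contig length, mapped reads, unmapped reads) and assumes that the row named hs37d5 is the decoy contig, listed separately from chromosomes 1-22, X and Y. The read counts in the example are invented for illustration, and the function names are my own.

```python
# Text as produced by `samtools idxstats sample.bam`.
# The read counts below are made up for illustration only.
EXAMPLE_IDXSTATS = (
    "1\t249250621\t9000\t10\n"
    "2\t243199373\t8000\t5\n"
    "X\t155270560\t2000\t0\n"
    "hs37d5\t35477943\t1000\t0\n"
    "*\t0\t0\t500\n"
)

# Assumption: contigs are named without a "chr" prefix, as in hs37d5.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def parse_idxstats(text):
    """Return a list of (name, length, mapped, unmapped) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length, mapped, unmapped = line.split("\t")
        rows.append((name, int(length), int(mapped), int(unmapped)))
    return rows


def mapped_fraction(rows, contigs):
    """Fraction of all mapped reads that fall on the given contigs."""
    total = sum(mapped for _, _, mapped, _ in rows)
    selected = sum(mapped for name, _, mapped, _ in rows if name in contigs)
    return selected / total if total else 0.0


rows = parse_idxstats(EXAMPLE_IDXSTATS)
print(f"primary chromosomes: {mapped_fraction(rows, PRIMARY):.1%}")  # 95.0%
print(f"decoy (hs37d5):      {mapped_fraction(rows, {'hs37d5'}):.1%}")  # 5.0%
```

Note that these are shares of reads, not depth of coverage. Dividing mapped reads by contig length gives a read density, which only approximates depth once you also account for read length.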


Thank you! I haven't run BEDTools genomecov or coverage because in both cases I need my BAM plus another file: -g in the first case and -a in the second. I guess that in both cases it must be my reference genome, am I right?

Yes, take a look here, where even the developer of BEDTools has provided an answer: Bedtools genomecov to identify regions at any coverage
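One detail that may help: as far as the BEDTools documentation describes it, the -g "genome file" is not the FASTA itself but a two-column, tab-delimited list of contig names and lengths. The first two idxstats columns already contain that information, so a small Python sketch like the one below could build it. The function name and example input are hypothetical, and the read counts are invented.

```python
# Example idxstats text; the read counts are made up for illustration only.
IDXSTATS_TEXT = (
    "1\t249250621\t9000\t10\n"
    "hs37d5\t35477943\t1000\t0\n"
    "*\t0\t0\t500\n"
)


def idxstats_to_genome_file(idxstats_text):
    """Build 'name<TAB>length' lines from idxstats output.

    The '*' row (reads with no assigned contig) is skipped because it is not
    a real contig.
    """
    out = []
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or fields[0] == "*":
            continue
        out.append(f"{fields[0]}\t{int(fields[1])}")
    return "\n".join(out) + "\n"


genome_txt = idxstats_to_genome_file(IDXSTATS_TEXT)
print(genome_txt, end="")
```

When the input to genomecov is a BAM given with -ibam, the contig lengths are taken from the BAM header, so -g is only needed for BED-like input. For BEDTools coverage, -a is an interval file (for example a BED of the regions you want reported), not the reference.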

