Hello,
I am using samtools idxstats and I'd like to know what hs37d5 is. Is that only the decoy sequences, or the sum of the reference genome + decoy sequences?
I mean, I have different BAM files and I want a measure of the "coverage" of each one. I thought about calculating the percentage of mapped reads (= #mapped reads ÷ sequence length) using the samtools idxstats output. Can I use that, or is it not representative, and should I use the #mapped reads from chr1-chrY instead?
For 'coverage' metrics, you appear to be referring to depth of coverage, which would be better served by BEDTools coverage or BEDTools genomecov. SAMtools idxstats gives a different type of information, i.e., the number of reads aligning to each of your contigs, and is more commonly used for gauging overall alignment metrics.
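If you still want a quick per-contig summary from idxstats, here is a minimal sketch in Python. It assumes the usual four tab-separated idxstats columns (contig name, sequence length, mapped reads, unmapped reads) and that the hs37d5 line is the decoy contig on its own. The function names, the `primary` set, and the numbers in `example` are made up for illustration. Note that mapped reads divided by sequence length is a read density, not a depth of coverage, because it ignores read length.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` text into (name, length, mapped, unmapped) tuples."""
    rows = []
    for line in text.strip().splitlines():
        name, length, mapped, unmapped = line.split("\t")
        rows.append((name, int(length), int(mapped), int(unmapped)))
    return rows


def mapped_summary(rows, primary):
    """Return total mapped reads, the fraction on `primary` contigs,
    and mapped reads per megabase for every contig with a non-zero length."""
    total = sum(mapped for _, _, mapped, _ in rows)
    on_primary = sum(mapped for name, _, mapped, _ in rows if name in primary)
    per_mb = {
        name: mapped / (length / 1e6)
        for name, length, mapped, _ in rows
        if length > 0  # skips the "*" line that holds unplaced reads
    }
    fraction = on_primary / total if total else 0.0
    return total, fraction, per_mb


# Hypothetical idxstats output, for illustration only (not real data).
example = (
    "1\t1000000\t5000\t10\n"
    "2\t2000000\t8000\t5\n"
    "hs37d5\t500000\t1000\t0\n"
    "*\t0\t0\t300\n"
)

rows = parse_idxstats(example)
# Contig names depend on your reference ("1" vs "chr1"), so adjust this set.
total, fraction, per_mb = mapped_summary(rows, primary={"1", "2"})
print(total, round(fraction, 3), per_mb)
```

Restricting `primary` to the chromosomes you care about keeps the decoy and unplaced contigs out of the comparison between BAM files.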
Thank you!
I haven't run BEDTools genomecov or coverage yet because in both cases I need my BAM plus another file: -g in the first case and -a in the second one. I guess that in both cases it must be my reference genome, am I right?
Yes, take a look here, where even the developer of BEDTools has provided an answer: Bedtools genomecov to identify regions at any coverage
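One detail to check against the BEDTools documentation: as far as I know, the file given to genomecov with -g is not the FASTA itself but a two-column "genome file" (contig name, then length), and it is not needed when the BAM is supplied with -ibam. For coverage, -a is normally a set of intervals (BED/GFF/VCF) rather than the reference. If you do need a genome file, a minimal sketch is to keep the first two columns of the .fai index that `samtools faidx` writes. The function name and the example index below are hypothetical.

```python
def fai_to_genome(fai_text):
    """Keep the first two columns (contig name, length) of a FASTA .fai index."""
    lines = []
    for line in fai_text.strip().splitlines():
        fields = line.split("\t")
        lines.append(f"{fields[0]}\t{fields[1]}")
    return "\n".join(lines) + "\n"


# Hypothetical .fai content: name, length, offset, linebases, linewidth.
example_fai = "1\t1000\t52\t60\t61\n2\t2000\t1122\t60\t61\n"

genome = fai_to_genome(example_fai)
print(genome, end="")
```

The contig order in the genome file should match the sort order of the BAM, so building it from the index of the same reference used for alignment is the safer choice.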