Is there a tool that quantifies “spatial coverage”, that is, what percentage of a reference assembly has reads mapped to it?
7 weeks ago
I’m not sure if this is the appropriate term, but the only way I can think of doing this is to convert the BAM file to a BED file, make an array of length N (where N is the size of the genome), add up the reads covering each position, and then take the fraction of positions with a nonzero count. That sounds very memory intensive, so I’m wondering if there’s a better way.

I have the following files:

  • BAM files of reads mapped to a metagenome of contigs from different metagenome-assembled genomes (MAG)
  • A table of identifiers [id_contig]<tab>[id_mag]
  • A fasta file with all of the contigs

I see that there is samtools coverage, but I don't know how to get coverage for only certain contigs in the BAM file. I also found bedtools genomecov, but it's a little confusing how I would adapt it to my data.

What I'm ultimately looking for is the following table:

             [mag_1] [mag_2] ... [mag_m]

where each value in the matrix is the percent of that MAG's genome covered by reads in the BAM file.

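You could run samtools coverage with the region option to restrict the output to a single contig (a contig name on its own is a valid region):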

-r, --region REG        show specified region. Format: chr:start-end. 
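If you run samtools coverage without -r, it reports one row per reference sequence in the BAM header, so you may not need to loop over contigs at all. Below is a rough Python sketch of how that table could be rolled up per MAG. It assumes the standard samtools coverage tabular output (columns #rname, startpos, endpos, numreads, covbases, ...) and your two-column [id_contig]<tab>[id_mag] table; the function name mag_breadth is made up for illustration.

```python
import csv
import io
from collections import defaultdict


def mag_breadth(coverage_tsv, contig_to_mag_tsv):
    """Return {mag_id: percent of bases covered by at least one read}.

    coverage_tsv:      text of the default `samtools coverage` table
    contig_to_mag_tsv: text of a headerless table, contig<TAB>mag per line
    (Both inputs are passed as strings here; this is an assumption made
    to keep the sketch self-contained.)
    """
    # Build the contig -> MAG lookup.
    contig_to_mag = {}
    for line in io.StringIO(contig_to_mag_tsv):
        line = line.rstrip("\n")
        if not line:
            continue
        contig, mag = line.split("\t")[:2]
        contig_to_mag[contig] = mag

    covered = defaultdict(int)  # covered bases per MAG
    total = defaultdict(int)    # total contig length per MAG

    reader = csv.DictReader(io.StringIO(coverage_tsv), delimiter="\t")
    for row in reader:
        mag = contig_to_mag.get(row["#rname"])
        if mag is None:
            continue  # contig not assigned to any MAG
        total[mag] += int(row["endpos"]) - int(row["startpos"]) + 1
        covered[mag] += int(row["covbases"])

    # Length-weighted breadth: sum of covered bases over sum of lengths.
    return {mag: 100.0 * covered[mag] / total[mag] for mag in total}
```

Running this once per BAM file (on the output of something like samtools coverage sample.bam > sample.coverage.tsv) would give one row of the matrix. Summing covbases and contig lengths per MAG weights each contig by its length, which is not the same as averaging the per-contig coverage percentages.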

