Is there a tool that quantifies “spatial coverage”, that is, what percentage of a reference assembly has at least one read mapped to it?
7 weeks ago
O.rka ▴ 620

I’m not sure this is the right term, but the only approach I can think of is to convert the BAM file to a BED file, build an array of length N (where N is the size of the genome), increment every position covered by a read, and then take the fraction of nonzero positions. That sounds very memory intensive, so I’m wondering whether there’s a better way.
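A lower-memory variant of that idea, as a minimal sketch: rather than allocating a genome-length array, merge the aligned intervals per contig and sum the merged lengths. The function name and the BED-style (contig, start, end) tuples are assumptions made for illustration, not part of any tool mentioned here.

```python
from collections import defaultdict


def covered_bases(intervals):
    """Return {contig: number of bases covered by at least one interval}.

    intervals: iterable of (contig, start, end) tuples using 0-based,
    half-open coordinates, as in BED (assumed input format).
    """
    by_contig = defaultdict(list)
    for contig, start, end in intervals:
        by_contig[contig].append((start, end))

    covered = {}
    for contig, spans in by_contig.items():
        spans.sort()
        total = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlaps or touches the current run: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current run and start a new one.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
        covered[contig] = total
    return covered
```

Memory here scales with the number of intervals instead of the genome length, so it only helps if the interval list is smaller than the genome; dividing each total by the contig length gives the covered fraction.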

I have the following files:

• BAM files of reads mapped to a metagenome of contigs from different metagenome-assembled genomes (MAGs)
• A table of identifiers [id_contig]<tab>[id_mag]
• A fasta file with all of the contigs

I see that there is samtools coverage, but I don't know how to get coverage for only certain contigs in the BAM file. I also found bedtools genomecov, but it's not clear to me how to adapt it to my data.

What I'm ultimately looking for is the following table:

             [mag_1] [mag_2] ... [mag_m]
[bam_file_1]
[bam_file_2]
...
[bam_file_n]


where each value in the matrix is the percentage of that MAG's genome covered by reads in that BAM file.

coverage genomics mapping assembly • 376 views

You could do samtools coverage with

-r, --region REG        show specified region. Format: chr:start-end.
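
Run without -r, samtools coverage prints one row per contig in the BAM header, so the per-MAG percentage can also be computed by summing its covbases column over the contigs of each MAG. A minimal Python sketch of that aggregation follows; the function name is hypothetical, and the contig-to-MAG dictionary is assumed to have been loaded from the [id_contig]<tab>[id_mag] table.

```python
import csv
import io
from collections import defaultdict


def percent_covered_per_mag(coverage_text, contig_to_mag):
    """Aggregate `samtools coverage` rows into percent covered per MAG.

    coverage_text: tab-delimited output of `samtools coverage`, including
        its header line (which starts with "#rname").
    contig_to_mag: dict mapping contig id -> MAG id (assumed to come from
        the [id_contig]<tab>[id_mag] table).
    """
    covbases = defaultdict(int)
    length = defaultdict(int)
    reader = csv.DictReader(io.StringIO(coverage_text), delimiter="\t")
    for row in reader:
        mag = contig_to_mag.get(row["#rname"])
        if mag is None:
            continue  # contig not assigned to any MAG
        covbases[mag] += int(row["covbases"])
        # startpos/endpos are 1-based and inclusive in the output.
        length[mag] += int(row["endpos"]) - int(row["startpos"]) + 1
    return {mag: 100.0 * covbases[mag] / length[mag] for mag in length}
```

Calling this once per BAM file (for example on the saved output of samtools coverage for that file) gives one row of the requested matrix. Summing covered bases and lengths before dividing weights each contig by its size, which a plain average of the per-contig coverage column would not do.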