Is there a tool that quantifies “spatial coverage”, that is, what percentage of a reference assembly has reads mapped to it?
20 months ago
O.rka ▴ 710

I’m not sure if this is the appropriate term, but the only way I can think of doing this is to convert a BAM file to a BED file, make an array of length N where N is the size of the genome, add up the reads covering each position, and then compute the fraction of positions with a nonzero count. That sounds very memory intensive, so I’m wondering if there’s a better way.
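The quantity described here is often called breadth of coverage. As a rough illustration of why a genome-sized array is not strictly required, the sketch below merges overlapping intervals and sums their lengths. It assumes BED-style half-open intervals `(start, end)` on a single contig; the function names `covered_bases` and `breadth_percent` are made up for this example and do not come from any tool.

```python
def covered_bases(intervals):
    """Count positions covered by at least one half-open interval [start, end)."""
    total = 0
    cur_start = cur_end = None
    # Sorting lets us merge overlapping or adjacent intervals in one pass.
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def breadth_percent(intervals, genome_length):
    """Percentage of a sequence of length genome_length covered by the intervals."""
    return 100.0 * covered_bases(intervals) / genome_length
```

Memory use here scales with the number of intervals rather than with the genome size. If the intervals already arrive sorted by start position, as they would from a coordinate-sorted BAM, the `sorted` call could be dropped and the input streamed.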

I have the following files:

  • BAM files of reads mapped to a metagenome of contigs from different metagenome-assembled genomes (MAG)
  • A table of identifiers [id_contig]<tab>[id_mag]
  • A fasta file with all of the contigs

I see that there is samtools coverage, but I don't know how to get coverage for only certain contigs in the BAM file. I also found bedtools genomecov, but it's not clear to me how to adapt it to my data.

What I'm ultimately looking for is the following table:

             [mag_1] [mag_2] ... [mag_m]
[bam_file_1] 
[bam_file_2] 
...
[bam_file_n]

Where each value in the matrix is the percentage of the MAG's genome covered by reads in the BAM file.
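One way to get a row of this table, sketched under assumptions: run samtools coverage once per BAM file, then sum the per-contig `covbases` values within each MAG and divide by the summed contig lengths. The code below takes the tabular output as text, with its default header columns `#rname`, `startpos`, `endpos`, `numreads`, `covbases`, and so on, plus a contig-to-MAG dictionary built from the identifier table. The function name `mag_breadth` and the example data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def mag_breadth(coverage_tsv, contig_to_mag):
    """Return {mag_id: percent of bases covered} from samtools coverage text.

    Assumes the default tab-separated output with its header line, and that
    every contig of interest appears as a row.
    """
    covered = defaultdict(int)
    length = defaultdict(int)
    reader = csv.DictReader(io.StringIO(coverage_tsv), delimiter="\t")
    for row in reader:
        mag = contig_to_mag.get(row["#rname"])
        if mag is None:
            continue  # contig not assigned to any MAG in the table
        covered[mag] += int(row["covbases"])
        # startpos and endpos are 1-based and inclusive.
        length[mag] += int(row["endpos"]) - int(row["startpos"]) + 1
    return {mag: 100.0 * covered[mag] / length[mag] for mag in length}


# Made-up example: three contigs, two MAGs.
example_tsv = (
    "#rname\tstartpos\tendpos\tnumreads\tcovbases\n"
    "c1\t1\t100\t5\t50\n"
    "c2\t1\t300\t9\t150\n"
    "c3\t1\t200\t2\t20\n"
)
example_map = {"c1": "mag_1", "c2": "mag_1", "c3": "mag_2"}
row_for_one_bam = mag_breadth(example_tsv, example_map)
```

Repeating this for each BAM file and stacking the resulting dictionaries would give the BAM-by-MAG matrix. Weighting by contig length in this way is a design choice: averaging the per-contig percentages instead would let short contigs count as much as long ones.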

coverage genomics mapping assembly • 668 views

You could run samtools coverage and restrict it to a region with the following option:

-r, --region REG        show specified region. Format: chr:start-end. 
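As a small sketch of how that option might be applied per contig, the helper below only builds the command lines, one per contig, passing the contig name alone as the region. It assumes the BAM file is coordinate-sorted and indexed, that `samtools` is on the PATH, and that contig names contain no characters that would confuse region parsing. `coverage_commands` is a hypothetical name.

```python
def coverage_commands(bam_path, contigs):
    """Build one `samtools coverage -r <contig> <bam>` argv list per contig."""
    return [["samtools", "coverage", "-r", contig, bam_path] for contig in contigs]


# To actually run the commands (requires samtools and an indexed BAM):
# import subprocess
# for argv in coverage_commands("sample.bam", ["contig_1", "contig_2"]):
#     result = subprocess.run(argv, capture_output=True, text=True, check=True)
#     print(result.stdout)
```

For many contigs it may be simpler to run samtools coverage once without `-r`, which reports every reference sequence, and filter the rows afterwards.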