Question

calculating coverage of gene in my metagenomes with samtools

0

Entering edit mode

9 weeks ago

lintonf • 0

I am interested to see the coverage of a set of genes in my metagenomes (they are unassembled metagenomes).

My plan was to map my raw reads to a database of bacterial gene sequences that I curated (specifically alcohol dehydrogenase aka adhE).

After trimming and removing human-host reads with KneadData, I am using bowtie2 to map my raw reads from the metagenomes to this gene database I've made.

I am trying to determine how many reads map to these genes in my metagenomes (i.e., depth of coverage). I know that samtools depth gives me the number of reads mapped to each position, and I know that samtools coverage gives me how many bases in the reference are covered by reads in my metagenomes.

My question is: if I am interested in how these genes are distributed and their prevalence in my metagenomes, I should use the meandepth column in the output from samtools coverage, correct? Or do I instead sum up the reads by position for each gene from samtools depth, then divide that by the length of the reference gene? I was using the meandepth result from samtools coverage as a metric for gene representation in my metagenomes, but am second guessing myself.

Thank you for your input!

samtools depth metagenome coverage • 411 views

ADD COMMENT • link updated 9 weeks ago by Pierre Lindenbaum 166k • written 9 weeks ago by lintonf • 0

0

Entering edit mode

if you use the same filtering and samtools depth -a there should be no difference isn't it ? samtools coverage would be faster.

ADD REPLY • link 9 weeks ago by Pierre Lindenbaum 166k