Question: metagenomic assembly coverage, for multiple samples
Asked 2.8 years ago by willnotburn (United States, Michigan State University):

I have multiple samples (interleaved reads) that were co-assembled into a single assembly, final.contigs.fa. The downstream goal is to analyze gene distribution among the samples (multivariate statistics, etc.). The first step is to map the reads from each sample back onto final.contigs.fa with bowtie2. I did that and converted the resulting SAM files to sorted BAM files. Now I am trying to determine coverage. Questions:

Q1: Assembly coverage. My friend asks: what's your coverage? He means it as a measure of assembly quality, expressed as a single easy number such as 30X. This post explores tools to get such a number from mpileup results.

So, do I just concatenate all my BAM files and run samtools mpileup concatenated.bam, or maybe samtools mpileup *.bam? Please help me out.
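
For context, here is a minimal sketch of one way to get a single pooled number. It swaps mpileup for samtools depth, which (as far as I understand) prints one depth column per input BAM after the contig and position columns, with -a asking it to include zero-depth positions. The mean_depth helper is a hypothetical name, and the BAM file names are only the placeholders used in this question.

```shell
# mean_depth (hypothetical helper): reads `samtools depth -a`-style rows on stdin
# (contig, position, then one depth column per BAM) and prints the mean of the
# per-position depths summed across all BAM columns.
mean_depth() {
  awk '{ for (i = 3; i <= NF; i++) total += $i; n++ }
       END { if (n > 0) printf "%.2f\n", total / n; else print "NA" }'
}

# Assumed usage (placeholder file names, requires samtools):
# samtools depth -a sample1.bam sample2.bam sample3.bam | mean_depth
```

Summing the per-sample columns at each position gives the pooled depth, so under these assumptions the BAM files would not need to be merged first just to get this number.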

Q2: Per-sample coverage. Following up on this old post, is there a difference between

samtools mpileup (options) sample1.bam sample2.bam sample3.bam

and

samtools mpileup (options) sample1.bam
samtools mpileup (options) sample2.bam
samtools mpileup (options) sample3.bam

in the way coverage is calculated? (The linked post asked about variant calling.)
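
To make the per-sample case concrete, the sketch below averages each depth column separately. It again assumes samtools depth style input (contig, position, then one depth column per BAM in command-line order) rather than mpileup output. The per_sample_depth name and the sample1, sample2 labels are hypothetical.

```shell
# per_sample_depth (hypothetical helper): reads `samtools depth -a`-style rows
# on stdin and prints one mean depth per BAM column, labelled by column order.
per_sample_depth() {
  awk '{ for (i = 3; i <= NF; i++) sum[i] += $i; if (NF > cols) cols = NF; n++ }
       END { for (i = 3; i <= cols; i++) printf "sample%d\t%.2f\n", i - 2, sum[i] / n }'
}

# Assumed usage (placeholder file names, requires samtools):
# samtools depth -a sample1.bam sample2.bam sample3.bam | per_sample_depth
```

Because each column is summed independently, this should give the same per-sample means whether the BAM files are passed together or one at a time, provided -a is used so every run covers the same positions.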

Lastly, any opinions on what is "good coverage"? For example, if each sample has 5X-10X coverage, is that good enough?
