I have multiple samples (interleaved reads), which were co-assembled into one final.contigs.fa assembly. The downstream goal is to analyze gene distribution among the samples, run multivariate stats, etc. The first step is to map the reads from each sample back onto the assembly with bowtie2. I did that and got sam files, which I converted to sorted bam files. Now I am trying to determine coverage. Questions:
Q1: Assembly coverage. My friend asks: what's your coverage? He means it as an assembly quality measure, a single easy number like 30X. This post explores tools to get such a number from a single bam file.
So, do I just concatenate all my bam files and run samtools mpileup concatenated.bam, or maybe samtools mpileup *.bam? Please help me out.
Q2: Per-sample coverage. Following up on this old post, is there a difference between
samtools mpileup (options) sample1.bam sample2.bam sample3.bam
and
samtools mpileup (options) sample1.bam
samtools mpileup (options) sample2.bam
samtools mpileup (options) sample3.bam
in the way coverage is calculated? (The linked OP asked about variant calling.)
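To make the per-sample number in Q2 concrete, here is a minimal sketch of one way a mean depth per bam file could be computed. It assumes the three-column, tab-separated output of samtools depth (contig, position, depth), run with -a so that positions with zero coverage are counted too. The name mean_depth is a hypothetical helper, not a samtools command, and this is only an illustration of the kind of number being asked about, not necessarily the recommended approach.

```shell
# Mean depth from `samtools depth`-style input on stdin:
# tab-separated columns are contig, position, depth.
# mean_depth is a hypothetical helper name, not part of samtools.
mean_depth() {
    awk -F'\t' '{ sum += $3 } END { if (NR > 0) print sum / NR }'
}

# Assumed usage, one number per sample (needs samtools on PATH):
#   for bam in sample1.bam sample2.bam sample3.bam; do
#       printf '%s\t' "$bam"
#       samtools depth -a "$bam" | mean_depth
#   done
```

Without -a, uncovered positions are left out of the samtools depth output, so the mean would be taken over covered positions only and would come out higher.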
Lastly, any opinions on what is "good coverage"? For example, if each sample has 5X-10X coverage, is that good enough?