Question: How to measure NGS depth coverage bias
Anand Rao (United States) wrote, 4 weeks ago:

Is there a software tool that reports a measure of the degree of non-uniformity in depth of Illumina sequencing coverage across a de novo assembled genome (against which the Illumina reads are mapped back)?

I have the PE read library (2×150 bp, HiSeq 4000), the de novo assembled genome, and the BAM file from mapping the former to the latter, and I have 290 such sets. I am curious to know which of these 290 have more versus less uniform coverage depth across their respective genomes.

I came across a paper (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-5-r51), but it names no software tool per se. Some of my assembly woes may mirror those from an earlier post, "Any advice for a de novo genome assembly".

To reiterate: is there software (like a supplement to something like BBTools' bbnorm) that can help me quickly visualize which of my genome assemblies are built on more uniform coverage depth?

Answer by Brian Bushnell (Walnut Creek, USA), 29 days ago:

If you map reads to an assembly, you can use BBMap's pileup.sh like this:

pileup.sh in=mapped.sam stats=covstats.txt hist=hist.txt

From the histogram you can visualize the uniformity of the coverage. covstats.txt will contain the average coverage and standard deviation on a per-scaffold basis. The program will also print the overall average coverage and standard deviation to the screen.
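To rank many assemblies by uniformity, one option is to reduce each histogram to a single number such as the coefficient of variation (standard deviation divided by mean depth); lower values suggest more even coverage. The sketch below is only an illustration: it assumes hist.txt is a two-column, whitespace-separated table of depth and number of bases with optional `#` header lines, which should be checked against the actual pileup.sh output. The function name `coverage_cv` is hypothetical.

```python
import math


def coverage_cv(hist_lines):
    """Return (mean, std_dev, cv) of depth from 'depth  base_count' lines.

    Assumed input: one depth bin per line, two whitespace-separated
    columns (depth, number of bases); lines starting with '#' are skipped.
    """
    total_bases = 0
    sum_depth = 0.0
    sum_depth_sq = 0.0
    for line in hist_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        depth, count = float(fields[0]), int(fields[1])
        total_bases += count
        sum_depth += depth * count
        sum_depth_sq += depth * depth * count
    if total_bases == 0:
        raise ValueError("histogram is empty")
    mean = sum_depth / total_bases
    # Population variance; clamp tiny negative values from rounding.
    variance = max(sum_depth_sq / total_bases - mean * mean, 0.0)
    std_dev = math.sqrt(variance)
    cv = std_dev / mean if mean > 0 else float("nan")
    return mean, std_dev, cv


# Toy histogram (made-up numbers, not real data).
example = ["#Coverage\tnumBases", "8\t10", "10\t80", "12\t10"]
mean, std_dev, cv = coverage_cv(example)
print(f"mean={mean:.2f} sd={std_dev:.2f} cv={cv:.3f}")
```

For real data, the lines would come from something like `open("hist.txt")`, one file per assembly, and the 290 resulting CV values could then be sorted or plotted.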

Comment by Anand Rao, 28 days ago:

You've got the stats= flag twice, so could you have meant out=covstats.txt?
Reply by Brian Bushnell, 28 days ago:

Fixed, thanks :) It actually doesn't matter (the second stats= overrides the first one). For pileup.sh, covstats, stats, and out are synonymous...
Answer by Len Trigg (New Zealand), 29 days ago:

One measure of non-uniformity of coverage is the fold-80 penalty (see https://genomebiology.biomedcentral.com/articles/10.1186/gb-2011-12-1-r1). Essentially, it is the amount of additional sequencing (expressed as a fold increase in coverage of the genome) required so that 80% of the target bases are covered at the current mean coverage. A value of 1 indicates perfectly even coverage; larger values indicate less uniform coverage.

The rtg coverage command from RTG Core computes the fold-80 penalty, in addition to other statistics and graphs that can be used to visualize coverage distribution information.
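For intuition, one common formulation of the fold-80 penalty is the mean depth divided by the depth that 80% of bases meet or exceed. The sketch below follows that formulation on a plain list of per-base depths (for example, the third column of `samtools depth -a` output). It is an assumption-laden illustration, not a reimplementation of rtg coverage, whose exact calculation may differ; the function name `fold_80_penalty` is hypothetical.

```python
def fold_80_penalty(depths):
    """Mean depth divided by the depth that 80% of bases meet or exceed.

    `depths` is a list of per-base coverage values. Returns infinity
    when more than 20% of bases have zero coverage.
    """
    if not depths:
        raise ValueError("no depths given")
    ordered = sorted(depths, reverse=True)
    n = len(ordered)
    # Integer ceil(0.8 * n): number of bases making up the top 80%.
    top_count = (4 * n + 4) // 5
    # Lowest depth among the best-covered 80% of bases.
    depth_80 = ordered[max(top_count - 1, 0)]
    if depth_80 == 0:
        return float("inf")
    mean = sum(ordered) / n
    return mean / depth_80


# Toy per-base depths (made-up numbers, not real data).
uniform = [10] * 10
uneven = [20, 20, 10, 10, 10, 10, 10, 10, 5, 5]
print(fold_80_penalty(uniform))
print(fold_80_penalty(uneven))
```

Perfectly uniform coverage gives a penalty of 1; in the uneven toy example a few over-covered bases raise the mean above the depth reached by 80% of bases, so the penalty rises above 1.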
