Question: What is the definition of "read depth" vs "coverage"? (again...)
23 months ago, ariel.balter wrote:

There are a number of biostars posts on how to calculate coverage and read depth, and what they mean. I'm still confused.

This is how I currently understand things:

Depth (or Read Depth) at a BP coordinate

The number of "hits" on that coordinate resulting from alignment. In other words, the number of aligned reads that land on that coordinate. The height of the bar over that position in a genome browser.

Coverage

The sum of all the depths across a particular coordinate range such as a gene or a peak.

Total Read Depth

The sum of all the depths across a particular coordinate range such as a gene or a peak. This is NOT exactly the same as

[read length] × [num reads, i.e. fastq lines ÷ 4] ÷ [length of reference in bp]

because not all reads in the fastq will get aligned or completely aligned.
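To make the formula above concrete, here is a minimal Python sketch of the nominal figure it gives. The function name and the input numbers are made up for illustration, and it assumes every read has the same length and aligns in full, which is exactly the caveat raised above.

```python
def nominal_depth(fastq_lines: int, read_length: int, reference_length: int) -> float:
    """Depth expected if every read in the FASTQ aligned over its full length."""
    num_reads = fastq_lines // 4  # a FASTQ record spans four lines
    return read_length * num_reads / reference_length


# Hypothetical inputs: 1,000,000 reads of 100 bp against a 5 Mbp reference.
print(nominal_depth(fastq_lines=4_000_000, read_length=100, reference_length=5_000_000))
```

The value computed from an actual alignment will usually be lower, since unaligned and partially aligned reads contribute fewer bases.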

Total Coverage

Same as Total Read Depth

Average Read Depth

[Total Read Depth] ÷ [number of base pairs (coordinates) in aligned region or reference]
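The definitions above can be sketched in a few lines of Python. This is only a toy illustration: the alignments are assumed to be simple half-open (start, end) intervals rather than records parsed from a BAM file, and the function name is hypothetical.

```python
def per_base_depth(alignments, region_start, region_end):
    """Depth at each coordinate in the half-open range [region_start, region_end).

    `alignments` is a list of half-open (start, end) intervals, one per
    aligned read. This toy format is an assumption, not a BAM parser.
    """
    depth = [0] * (region_end - region_start)
    for start, end in alignments:
        # Clip each read to the region before counting its bases.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth


# Three hypothetical 5 bp reads on a 10 bp region.
reads = [(0, 5), (2, 7), (4, 9)]
depth = per_base_depth(reads, 0, 10)

total_depth = sum(depth)                  # sum of depths over the region
average_depth = total_depth / len(depth)  # total depth / number of coordinates

print(depth, total_depth, average_depth)
```

Under these definitions the total depth equals the number of aligned bases that fall inside the region, so it depends on the region length as well as on how many reads were sequenced.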

Example:

I want to calculate the differential binding between ChIP-seq samples. This is roughly

[coverage under peak in treatment 1] ÷ [coverage under peak in treatment 2]

However, I should really normalize by the average read depth of each treatment. So

[coverage under peak in treatment 1 ÷ average read depth in treatment 1 bam]
÷
[coverage under peak in treatment 2 ÷ average read depth in treatment 2 bam]

Is this correct?
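For reference, the arithmetic of the normalized ratio described above could be written as the following Python sketch. The function name and all numbers are hypothetical, and it only mirrors the formula in this post; it is not a statistical test for differential binding.

```python
def normalized_peak_ratio(peak_cov_1, avg_depth_1, peak_cov_2, avg_depth_2):
    """Ratio of peak coverage between two treatments, each scaled by the
    average read depth of its own BAM (as proposed in the post)."""
    return (peak_cov_1 / avg_depth_1) / (peak_cov_2 / avg_depth_2)


# Hypothetical values: treatment 1 was sequenced more deeply than treatment 2.
raw_ratio = 600 / 200
norm_ratio = normalized_peak_ratio(600, 30, 200, 20)

print(raw_ratio, norm_ratio)
```

With these made-up numbers the raw ratio of 3.0 shrinks to 2.0 once each sample is scaled by its own average depth.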


There are no widely accepted definitions. When you see "depth" or "coverage" mentioned in a paper, you almost always need to figure out its exact meaning from the context.

Reply written 23 months ago by lh3

NOOOOO! That is exactly what I did not want to hear :(

Reply written 23 months ago by ariel.balter

I typically use the words like this:

The depth of an experiment is the total number of reads obtained from the sequencer. More stringently, you can say that the effective depth is the number of reads that remain after all filtering, e.g. after removing duplicate and low-quality reads and keeping only proper pairs for paired-end sequencing. Therefore, sequencing deeper means sequencing your samples on a larger flow cell or on a platform that can produce more reads, e.g. NextSeq instead of MiSeq.

The coverage is the number of reads that covers a certain region or nucleotide.
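Under this usage, coverage is a count of reads rather than a sum of per-base depths. A minimal Python sketch, assuming alignments are given as half-open (start, end) intervals (a toy format, with a hypothetical function name):

```python
def reads_overlapping(alignments, region_start, region_end):
    """Number of reads that overlap the half-open region [region_start, region_end)."""
    return sum(
        1
        for start, end in alignments
        if start < region_end and end > region_start
    )


# Hypothetical alignments; the last read lies outside both queries below.
reads = [(0, 5), (2, 7), (4, 9), (20, 25)]

region_count = reads_overlapping(reads, 4, 6)      # coverage of a 2 bp region
nucleotide_count = reads_overlapping(reads, 5, 6)  # coverage of a single nucleotide

print(region_count, nucleotide_count)
```

For a single nucleotide this count is the same as the per-base depth; for a longer region a read is counted once no matter how many of its bases fall inside.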

Still, as Heng pointed out, there is no real uniform definition, so whenever you lead a discussion or give a talk, reserve one slide to explain the vocabulary so that everyone is on the same page and can follow properly.

Reply written 23 months ago by ATpoint

This one should probably be changed to be a forum post rather than a question.

Reply written 23 months ago by Lars Juhl Jensen