Question: What Is The Difference Between Sequencing Depth And Coverage
9.6 years ago by
User 6659970
User 6659970 wrote:


Please could you clarify for me the different metrics associated with sequencing data? What is the difference between depth and coverage?

I have seen a question with this title closed already because it was classed as an exact duplicate of this question. I have read that question and thought I understood the answer. However, I have just seen another question on this forum here, which leads me to think that depth and coverage are different things, as the people posing the question have given different quality metrics for depth and coverage:

e.g. depth (ex min. 50X)
e.g. coverage (ex min 90% per sample/per region)
e.g. quality score (ex. 90% of all sites [depth > 50X and Q20])

So, in order to stop this question being closed as a duplicate, perhaps my question should be phrased as: please can you explain the terms depth/coverage/quality score given in the question here?

Thanks a lot

read coverage sequencing • 41k views
ADD COMMENTlink modified 9.6 years ago by Michael Dondrup47k • written 9.6 years ago by User 6659970

Hummm... Another definition question! I think we should set a standard answer for this one. It will certainly reappear sooner or later. Where's the wiki?

ADD REPLYlink written 9.6 years ago by Jarretinha3.3k

+1 For the idea to document standard questions and answers on a wiki. And it would be great to make slides based on that as well.

ADD REPLYlink written 9.6 years ago by Chris Evelo10k
9.6 years ago by
Bergen, Norway
Michael Dondrup47k wrote:

There is no (well-defined) difference; see here: What Is The Sequencing 'Depth'? My impression is that the two terms are often used synonymously.

The definition has to be inferred from the use of the terms in the literature. I often see the terms combined as "depth of coverage" e.g.:

This repeated sequencing is known as genome "depth of coverage."

Or in two different ways here:

Sensitivity of this technology depends on the depth of the sequencing run (i.e. the number of mapped sequence tags), the size of the genome and the distribution of the target factor. The sequencing depth is directly correlated with cost.

In this article, depth is used to refer to the whole genome, while coverage seems to be used for particular loci, as in these made-up examples: "the genome was sequenced with a depth of 10X" vs. "coverage of the xyZ gene was low".

ADD COMMENTlink modified 13 months ago by RamRS30k • written 9.6 years ago by Michael Dondrup47k

Yes, I think that's what was meant: 90% of all sample bases are covered at least once, which is yet another use of the term. Some regions are hard to sequence, but it was just an example in that question. Another, different use of "coverage" appears in the LASTZ alignment tool documentation: "Coverage is the fraction of bases in the entire input sequence (target or query, whichever is shorter) that are included in the alignment block, expressed as a percentage." Again, something very different. In the end, language is fuzzy. If we cover 80% of a communication with others intellectually, I would already consider that good depth ;)
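
To make the distinction between the usages above concrete, here is a minimal sketch that computes both a mean depth and a "fraction of bases covered" (sometimes called breadth) from a list of per-base depths. The function name `depth_metrics`, the `min_depth` parameter, and the toy numbers are illustrative assumptions, not anything taken from the linked question; in practice the per-base depths would come from an alignment tool.

```python
def depth_metrics(depths, min_depth=1):
    """Summarise per-base read depths for one region.

    Returns (mean depth, fraction of positions with depth >= min_depth).
    """
    n = len(depths)
    if n == 0:
        raise ValueError("region has no positions")
    mean_depth = sum(depths) / n
    breadth = sum(1 for d in depths if d >= min_depth) / n
    return mean_depth, breadth


# Hypothetical per-base depths for a 10 bp region (made-up values).
toy_depths = [0, 0, 3, 5, 8, 8, 6, 4, 2, 1]

mean_depth, breadth_1x = depth_metrics(toy_depths, min_depth=1)
_, breadth_5x = depth_metrics(toy_depths, min_depth=5)

print(f"mean depth: {mean_depth:.1f}X")
print(f"covered at least once: {breadth_1x:.0%}")
print(f"covered at >= 5X: {breadth_5x:.0%}")
```

Read this way, a requirement like "min 50X" constrains the first number, while "min 90% per region" constrains the second; a threshold such as "90% of sites at depth > 50X" is the second number computed with a higher `min_depth`.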

ADD REPLYlink written 9.6 years ago by Michael Dondrup47k

Please note that at an average coverage of 2X, the area that is not covered at all is over 10% for simple statistical reasons alone (that is, without taking into account that some areas are easier to sequence than others).
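
The "over 10%" figure can be reproduced with a short sketch, assuming reads land uniformly at random so that the depth at any base is approximately Poisson-distributed with mean equal to the average coverage (the usual idealised shotgun model; that this is the model the comment has in mind is an assumption). Under that assumption the chance a base is never hit is exp(-c). The function name `uncovered_fraction` is illustrative.

```python
import math


def uncovered_fraction(mean_coverage):
    """Expected fraction of bases with zero reads, assuming Poisson depth."""
    return math.exp(-mean_coverage)


for c in (1, 2, 5, 10):
    print(f"{c}X average coverage: {uncovered_fraction(c):.4%} expected uncovered")
```

At 2X this gives roughly 13.5%, consistent with the statement above. Real data will typically do worse, because coverage is not uniform.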

ADD REPLYlink written 9.6 years ago by Chris Evelo10k

Thanks. What do you think they mean in the question I referred to by coverage = min 90% per sample/per region?

ADD REPLYlink written 9.6 years ago by User 6659970

Thanks. What do you think a coverage of 90% per sample per region means in the question I linked to? Does that mean that 90% of any particular sequence region is sequenced to any depth? That would suggest 10% of a region isn't sequenced at all, which doesn't sound likely.

ADD REPLYlink written 9.6 years ago by User 6659970

Thanks for answering. If 90% of each region per sample is covered at least once and 10% is not covered, then surely the length of the region in question has a huge impact on that? How is region length taken into consideration? Is there a certain coverage that ensures 100% of the genome is sequenced at least once, or is that not possible due to hard-to-sequence regions?

ADD REPLYlink written 9.6 years ago by User 6659970

In theory, only infinite coverage ensures that 100% of bases are sequenced at least once with probability 1. In practice, for larger genomes, that means it is impossible by shotgun sequencing alone.
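
As a rough illustration of why genome size matters here, the sketch below uses the same idealised Poisson assumption (uniformly random reads, no hard-to-sequence regions) to estimate how many bases are expected to remain uncovered at a given average coverage. The function names and the 3-billion-base genome size are hypothetical values chosen for the example.

```python
import math


def expected_uncovered_bases(genome_size, mean_coverage):
    """Expected number of bases with zero reads under a Poisson model."""
    return genome_size * math.exp(-mean_coverage)


def coverage_for_one_expected_gap(genome_size):
    """Average coverage at which the expected uncovered count drops to 1."""
    return math.log(genome_size)


genome_size = 3_000_000_000  # hypothetical large genome, in bases

for c in (10, 20, 30):
    gaps = expected_uncovered_bases(genome_size, c)
    print(f"{c}X: about {gaps:.3g} bases expected uncovered")

print(f"expected uncovered bases reach 1 at about "
      f"{coverage_for_one_expected_gap(genome_size):.1f}X")
```

The expected number of uncovered bases shrinks exponentially but never reaches exactly zero at any finite coverage, and this model ignores regions that are systematically difficult to sequence, so it is an optimistic bound rather than a prediction.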

ADD REPLYlink written 8.5 years ago by Michael Dondrup47k