Question: 454 Coverage Quality
8.7 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

I've been given a set of 454 sequences (exome sequencing). The coverage was:

 Mean coverage                          18,93
 1x                                   90,60%
 5x                                   68,09%
 10x                                  49,68%
 20x                                  33,75%
 40x                                  16,28%
 80x                                   0,76%
 100x                                  0,07%

I'm not used to this kind of data. As far as I understand, about 10% of the bases haven't been covered at all.

If this were a set of Illumina GA data, I would say that it was badly covered.

But for 454, what should I think of this result?


(Human Genome, two chromosomes have been sequenced)

             Count         Average Length Total-Bases
Reads        563.561       328,89         185.348.167
Matched      473.235       328,26         155.345.247
Not matched  90.326        332,16          30.002.920
References   2             127.542.357    255.084.714
modified 4.8 years ago by Biostar ♦♦ 20 • written 8.7 years ago by Pierre Lindenbaum120k

Hi Pierre, here are a few questions to help us assess your data: How many sequences did you get? What's their average length? What is the expected total exome length for your species? Cheers

written 8.7 years ago by Eric Normandeau10k

Eric, thanks, I'll get this information tomorrow morning.

written 8.7 years ago by Pierre Lindenbaum120k

It would also help to know more about the origins and characteristics of the sample that was sequenced.

written 8.7 years ago by Istvan Albert ♦♦ 80k
8.7 years ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

From a purely probabilistic point of view, 18x coverage should lead to far fewer than 10% uncovered bases.

But since you are saying that this data comes from a more novel methodology (exon capture), the process may have larger inherent errors. I recall some papers indicating a 96-98% recovery rate of exons in published results, and I would expect less successful results in day-to-day trials. It might just be that some exonic regions are not captured well.

8.7 years ago by
Casbon3.2k wrote:

Need more info: 454 Titanium or FLX, plus the extraction technique and the targets.

You say that you targeted two chromosomes. Do you mean the whole chromosomes or just their exomes? How was the extraction done?

For your mean depth of 18, a really naive Poisson model suggests that the fraction of bases with zero coverage should be far less than 10% (the probability that a Poisson variable with lambda=18 equals zero is e^-18, about 1.5e-8), but it really depends on your extraction technology.
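
As a rough illustration of the naive Poisson argument, here is a minimal Python sketch. It assumes per-base depth is Poisson-distributed with lambda equal to the mean depth, which is an idealization; the function name `poisson_pmf` is just a label chosen for this example.

```python
import math

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Assumption: depth at each base ~ Poisson(mean depth), i.e. perfectly even coverage.
mean_depth = 18
p_zero = poisson_pmf(0, mean_depth)

print(f"P(depth = 0 | lambda = {mean_depth}) = {p_zero:.3e}")
```

Under this idealized model essentially no base would be left uncovered, so an observed 10% uncovered has to come from something other than sampling noise.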

8.7 years ago by
Boston, MA
Fiamh220 wrote:

As Istvan pointed out, it depends heavily on the extraction approach. The figure I've seen mentioned most consistently is even lower, at 80-90% (for example, see Dan's recent report from the CSHL meeting).

8.6 years ago by
Ketil3.9k wrote:

There could also be a problem with how the coverage was measured, e.g., if you're using an aligner (like Blast) that masks low-complexity regions, or if you require unambiguous matches (in which case repeats might be missed).

Generally, read coverage isn't as evenly distributed as one would like, so stochastic models like Poisson only work as a rough approximation.
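
To see how rough the approximation is for this dataset, one could compare the coverage table from the question with what a Poisson model would predict. The sketch below assumes the table rows are cumulative ("at least Nx") and that the reported mean of 18,93 is the Poisson lambda; both are assumptions on my part, and the helper name is made up for this example.

```python
import math

def poisson_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the pmf in log space."""
    below = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k)
    )
    return 1.0 - below

# Observed percentages copied from the question (assumed to mean "depth >= N").
observed = {1: 90.60, 5: 68.09, 10: 49.68, 20: 33.75, 40: 16.28, 80: 0.76, 100: 0.07}
mean_cov = 18.93  # reported mean coverage, used as lambda (assumption)

rows = []
for depth, obs_pct in observed.items():
    exp_pct = 100.0 * poisson_at_least(depth, mean_cov)
    rows.append((depth, obs_pct, exp_pct))
    print(f">= {depth:3d}x   observed {obs_pct:6.2f}%   Poisson {exp_pct:9.4f}%")
```

A Poisson model with this mean puts almost every base at 1x or more and almost none at 40x or more, whereas the observed table has both many uncovered bases and a sizeable fraction at 40x. That is the signature of uneven coverage rather than of too little sequencing.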
