Question: Average coverage for whole exome sequencing
4.6 years ago by
haiying.kong320 wrote:

We are trying to identify somatic mutations by comparing DNA from normal and diseased tissue. Because the diseased tissue samples are small, the amount or concentration of DNA is very low for some samples. We sent 10 samples to a company as a trial, and 2 out of 10 (each sample generates 2 data sets) failed some of the quality checks. We used FastQC for QC. For these samples, we requested an average coverage of 60x.

The FastQC checks that failed are:

Per base sequence quality

Per base GC content

Per base sequence content

Should we increase the coverage requirement to 90x to get better quality? In general, how much coverage gives data of decent quality? We use the Illumina HiSeq platform.


Your figures / links are not working. The correct way to show images on Biostars is to upload them to a site like imgur and post the link here.

written 4.6 years ago by h.mon30k

or try to post your figures here

written 4.6 years ago by TriS4.2k
4.6 years ago by
National Institutes of Health, Bethesda, MD
Sean Davis26k wrote:

First, average coverage is not a very useful number on its own. You'll want to think in terms of the proportion of bases (in the exome) covered at X or higher (where X is the target coverage). For example, you might aim for 85% of bases covered at 100X or higher.
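
To make the difference between the two metrics concrete, here is a minimal Python sketch. The per-base depths are made-up numbers for illustration, and the helper names `mean_depth` and `fraction_covered` are hypothetical, not part of any tool. In practice the depths would come from a per-base depth report restricted to the exome target regions.

```python
def mean_depth(depths):
    """Average coverage across all target positions."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


def fraction_covered(depths, threshold):
    """Fraction of target positions with depth >= threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= threshold) / len(depths)


# Hypothetical per-base depths for ten target positions (illustration only).
depths = [250, 240, 230, 5, 3, 2, 120, 110, 10, 30]

print(mean_depth(depths))             # 100.0
print(fraction_covered(depths, 100))  # 0.5
```

In this toy example the average coverage is 100X, yet only half of the positions reach 100X, which is why the proportion of bases at or above the target is the more informative figure.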

Now, for somatic sequencing, you'll probably want to aim for significantly higher coverage for the tumor than for the normal sample. Folks routinely sequence to 150X (per-base target) and higher so that somatic variants can be discovered in samples with normal tissue admixture or lower variant allele frequency. What you choose will depend on your biological questions, your budget, and your sample purity.
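
As a rough way to see why lower variant allele frequency calls for more depth, here is a small Python sketch under a simplifying assumption: the number of variant-supporting reads at a site follows a binomial distribution with the site depth as the number of trials and the variant allele frequency as the success probability. Sequencing error, mapping bias, and caller-specific filters are ignored. The function name `detection_probability` and the example values (5% allele frequency, at least 5 supporting reads) are assumptions for illustration, not recommendations.

```python
from math import comb


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf).

    Simplified model: ignores sequencing error and caller filters.
    """
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Compare a few depths for a hypothetical variant at 5% allele frequency,
# requiring at least 5 supporting reads (both values are assumptions).
for depth in (60, 150, 500):
    p = detection_probability(depth, vaf=0.05, min_alt_reads=5)
    print(f"{depth}X: P(>= 5 variant reads) = {p:.3f}")
```

Under this model the chance of seeing enough supporting reads rises with depth, so the depth you need depends on the lowest allele frequency you want to detect.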

4.6 years ago by
Spain. Universidad de Córdoba
Antonio R. Franco4.5k wrote:
And sometimes up to 500X is required, depending on the frequency of the variant and on how much of the tissue you expect to be mutated relative to normal tissue.
