What Are The Metrics To Determine The Quality Of A Whole Genome Sequence
12.5 years ago
Biomed 4.9k

Hi, I would like to generate a set of metrics to evaluate the general quality of a whole genome sequence, so that before I start analyzing it I can be reasonably confident that the variation I am after is in the haystack. I know there are tools like FastQC that generate reports, but without a good idea of what to expect, the tools are less effective. I know there is no single criterion and everyone has their own list, but I think there are some common criteria that most people could agree on.

For example: GC% should be between 40 and 50, or the total number of reads should be >3 million, etc. Thanks.
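To make the two example metrics concrete, here is a minimal sketch of how read count and overall GC% could be computed from FASTQ text. It assumes unwrapped 4-line records; the function name `fastq_summary` and the toy reads are made up for illustration, and the cutoffs above are only examples, not agreed thresholds.

```python
from typing import Iterable, Tuple


def fastq_summary(lines: Iterable[str]) -> Tuple[int, float]:
    """Return (read_count, gc_percent) for FASTQ text.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    reads = 0
    gc = 0
    bases = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            seq = line.strip().upper()
            reads += 1
            gc += seq.count("G") + seq.count("C")
            bases += len(seq)  # note: N bases are counted in the denominator
    gc_percent = 100.0 * gc / bases if bases else 0.0
    return reads, gc_percent


# Toy input (hypothetical reads), passed as a list of lines.
example = ["@r1", "ACGT", "+", "IIII", "@r2", "GGCC", "+", "IIII"]
reads, gc_percent = fastq_summary(example)
print(reads, gc_percent)
```

For a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.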

genome next-gen sequencing quality • 2.8k views
12.5 years ago
Pablo ★ 1.9k

On FastQC's page, http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/, you'll find a couple of example reports for a 'good' and a 'bad' quality run.

You are right that there are no well-defined thresholds for saying when a run has gone 'bad'. I think the answer is that it depends heavily on what kind of downstream analysis you are planning.

Edit: The main criterion I use is that if the quality plot drops below 25 very fast, then it's time to start trimming (or re-sequencing). The GC criterion doesn't always apply (e.g. some Plasmodium species are very AT-rich, and it certainly doesn't apply to bisulfite sequencing).
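As a rough illustration of the "quality drops below 25" rule of thumb, the sketch below computes the mean Phred quality at each read position and reports the first position where it falls under a threshold. It is not how FastQC itself builds its plot; it assumes Phred+33 encoding and unwrapped 4-line FASTQ records, and the function names and toy reads are hypothetical.

```python
from typing import Iterable, List, Optional


def mean_quality_per_position(lines: Iterable[str], offset: int = 33) -> List[float]:
    """Mean Phred score at each read position.

    Assumptions: 4-line FASTQ records and Phred+33 quality encoding.
    """
    totals: List[int] = []
    counts: List[int] = []
    for i, line in enumerate(lines):
        if i % 4 == 3:  # fourth line of each record is the quality string
            for pos, ch in enumerate(line.strip()):
                if pos == len(totals):  # first read that reaches this position
                    totals.append(0)
                    counts.append(0)
                totals[pos] += ord(ch) - offset
                counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]


def first_position_below(means: List[float], threshold: float = 25.0) -> Optional[int]:
    """1-based position where the mean quality first drops below threshold, or None."""
    for pos, mean in enumerate(means, start=1):
        if mean < threshold:
            return pos
    return None


# Toy input: 'I' is Q40, '5' is Q20, '+' is Q10 under Phred+33.
example = ["@r1", "ACGT", "+", "II5+", "@r2", "ACGT", "+", "II5+"]
means = mean_quality_per_position(example)
print(means, first_position_below(means))
```

A drop that appears early in the read would suggest trimming or re-sequencing by the rule above, while a drop only in the last few bases is usually less of a concern; where exactly to draw the line still depends on the downstream analysis.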

