How do we measure and report the quality of next generation sequencing?
6.8 years ago
bdolin ▴ 90

Greetings,

Does anyone have guidance on how best to report quality and accuracy of NGS (overall, and for specific regions) for a clinical application?

Thanks

next-gen • 2.8k views

First step: read How To Ask Good Questions On Technical And Scientific Forums.

Second step: use some general purpose quality-checking software, like FastQC.

Third step: use targeted quality checks. Which ones apply will depend on the type of material (RNA, DNA, tissue, cell line, whatever), library prep, sequencing technology, and so on. To narrow it down a bit, mapping metrics would be useful.
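As a rough illustration of what "mapping metrics" can mean, here is a minimal Python sketch that counts mapped and properly paired reads from SAM text using the standard SAM flag bits. The function name `mapping_metrics` and the returned dictionary keys are made up for this example; in practice a dedicated tool would normally be used, and this only shows the idea.

```python
def mapping_metrics(sam_lines):
    """Summarise basic mapping metrics from an iterable of SAM text lines.

    Hypothetical helper for illustration only. It relies on the standard
    SAM flag bits: 0x4 (unmapped), 0x2 (properly paired),
    0x100 (secondary alignment), 0x800 (supplementary alignment).
    """
    total = mapped = proper = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # Count each read once: ignore secondary and supplementary records.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if not flag & 0x4:
            mapped += 1
            if flag & 0x2:
                proper += 1
    return {
        "total": total,
        "mapped": mapped,
        "properly_paired": proper,
        "mapped_fraction": mapped / total if total else 0.0,
    }


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    print(mapping_metrics(example))
```

Which of these numbers matter, and what thresholds are acceptable, will again depend on the assay.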

6.7 years ago
bdolin ▴ 90

This is so far the best resource I've found on this topic: https://www.fda.gov/downloads/medicaldevices/deviceregulationandguidance/guidancedocuments/ucm509838.pdf


Thanks for the comments.

Another relevant reference is the FHIR Genomics "Sequence" resource [https://hl7.org/fhir/genomics.html#sequence], which has fields for the quality of NGS data, and is aligning with the FDA's draft recommendations.

So perhaps HL7 is providing an evolving, standard way to communicate quality data to an electronic health record, and perhaps multiple streams of thought are converging on an agreed-upon set of metrics for expressing quality. I'm still fuzzy, though, on just where the quality data will come from (e.g. whether the lab will make it available).
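To make "metrics for expressing the quality" a bit more concrete, here is a small Python sketch of the usual accuracy summary for variant calls compared against a truth set: precision, recall and F-score computed from true-positive, false-positive and false-negative counts. The function name and dictionary keys are my own and are not taken from the FHIR or FDA documents; check those for the exact fields they expect.

```python
def accuracy_metrics(tp, fp, fn):
    """Precision, recall and F-score from variant-call comparison counts.

    tp: calls that match the truth set
    fp: calls that are not in the truth set
    fn: truth-set variants that were not called
    Names and output keys are illustrative assumptions only.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score}


if __name__ == "__main__":
    print(accuracy_metrics(tp=90, fp=10, fn=30))
```

The same counts can be reported overall or restricted to a specific region of interest.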

6.7 years ago
Optimist ▴ 180

Hello bdolin,

FastQC serves your purpose very well.

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are:

  • Import of data from BAM, SAM or FastQ files (any variant)
  • Providing a quick overview to tell you in which areas there may be problems
  • Summary graphs and tables to quickly assess your data
  • Export of results to an HTML based permanent report
  • Offline operation to allow automated generation of reports without running the interactive application
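
To give a feel for the kind of check involved, here is a minimal Python sketch of one such summary: the mean Phred quality at each read position in a FASTQ file. This is not FastQC's own code. It assumes plain four-line FASTQ records and Phred+33 encoding, and the function name `per_base_mean_quality` is made up for the example.

```python
from io import StringIO


def per_base_mean_quality(handle, phred_offset=33):
    """Return the mean Phred score at each read position.

    Assumes simple four-line FASTQ records (header, sequence, '+', quality)
    and Phred+33 encoding by default. Illustrative sketch only.
    """
    sums, counts = [], []
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        for i, ch in enumerate(qual):
            if i == len(sums):
                # First time a read reaches this position.
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - phred_offset
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


if __name__ == "__main__":
    fastq = StringIO("@r1\nACG\n+\nII5\n@r2\nAC\n+\nI5\n")
    print(per_base_mean_quality(fastq))
```

A drop in mean quality toward the end of the reads is the sort of pattern such a summary is meant to reveal.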

Use the following link for further reading.

cheers

Optimist


While FastQC is fine for general-purpose sequencing QC, the following important bit from the original post needs to be considered:

for a clinical application

I don't think the FDA has put out final recommendations for NGS data (these would apply to the US, but I assume they are widely copied by other regulatory agencies).

Edit: I am not sure if OP added this important bit sometime after @h.mon wrote the comment above but it is there now.


According to this LinkedIn article (https://www.linkedin.com/pulse/evolving-standards-clinical-ngs-akanksha-wattal), the CDC and CAP (both, I guess, are in the US) require certain quality measures for clinical NGS; refer to Table 2 there. However, the article does not list which tools are accepted.

6.7 years ago
chen ★ 2.5k

Try AfterQC. It offers:

  • Filters out reads with too low quality, too short length, or too many N bases
  • Filters out reads with abnormal polyA/polyT/polyC/polyG sequences
  • Performs per-base quality control and plots the figures
  • Trims reads at the front and tail, according to QC results
  • For paired-end sequencing data, automatically corrects low-quality mismatched bases in the overlapping region of read1/read2
  • Detects and eliminates bubble artifacts caused by fluid dynamics issues in the sequencer
  • Single-molecule barcode sequencing support: if all reads have a single-molecule barcode (see duplex sequencing), AfterQC moves the barcodes from the reads to the FASTQ query names
  • Supports both single-end and paired-end sequencing data
  • Automatic adapter cutting for paired-end sequencing data
  • Sequencing error estimation and error distribution profiling
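
The first item in the list can be sketched in a few lines of Python. This is only an illustration of the idea, not AfterQC's actual implementation; the function name `passes_filters` and all threshold defaults are hypothetical, and Phred+33 encoding is assumed.

```python
def passes_filters(seq, qual, min_mean_q=20, min_len=35, max_n=5, phred_offset=33):
    """Decide whether a read survives simple quality filters.

    Illustrative only: thresholds are arbitrary example values,
    not defaults of any real tool.
    """
    # Reject reads that are too short (this also covers empty reads).
    if len(seq) < min_len or not qual:
        return False
    # Reject reads with too many ambiguous bases.
    if seq.upper().count("N") > max_n:
        return False
    # Reject reads whose mean Phred quality is too low.
    mean_q = sum(ord(c) - phred_offset for c in qual) / len(qual)
    return mean_q >= min_mean_q


if __name__ == "__main__":
    print(passes_filters("ACGT" * 10, "I" * 40))  # long, high-quality read
    print(passes_filters("ACGT", "IIII"))         # too short
```

Appropriate thresholds depend on the platform and the application, so the values above should not be copied as-is.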

It is available at: https://github.com/OpenGene/AfterQC

Here is a report sample: http://opengene.org/AfterQC/report.html
