Question

fastqc - typical for Illumina seq?

0

17 months ago

beginner123 • 0

Hi,

I read everywhere that you can't rely fully on the FastQC report when looking at your reads, and that how much you can trust it depends on the experiment you are working with. Well in my case the reads were generated using IlluminaSeq. What do I expect as normal when looking at the fastqc.html?

I've asked this question here and I got this link as a tip: https://sequencing.qcfail.com/software/fastqc/ But it still doesn't answer my question.

This is what I get:

[Image: FastQC report screenshot]

illuminaseq fastqc • 1.2k views

ADD COMMENT • link 16 months ago by beginner123 • 0

0

It says that it passed. But my professor told me that the ranges shouldn't be like that? Also, if you go to https://www.illumina.com/science/technology/next-generation-sequencing/plan-experiments/quality-scores.html

it says that "Lower Q scores can result in a significant portion of the reads being unusable. They may also lead to increased false-positive variant calls, resulting in inaccurate conclusions."

ADD REPLY • link 17 months ago by beginner123 • 0

0

You asked this question in a different way a few days back: FASTQC.html: Quality control of reads.

But my professor told me that the ranges shouldn't be like that?

If it is a matter of visually having everything above a certain Q score, then trim your data using that score cutoff. Real-life data can look worse than this as far as Q scores go and still work fine.
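To make "trim using that score cutoff" concrete, here is a minimal sketch of 3' quality trimming in Python. It assumes Phred+33 encoded quality strings (the encoding used by current Illumina pipelines). The function names and the example read are made up for illustration. Real trimmers use more robust rules (sliding windows, running sums), so treat this as a picture of the idea rather than a replacement for a dedicated tool.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string (assumed Phred+33) into integer Q scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_3prime(sequence, quality_string, cutoff=20):
    """Drop bases from the 3' end until the last remaining base has Q >= cutoff.

    This is the simplest possible rule; it only looks at the read end and
    never removes low-quality bases from the middle of the read.
    """
    scores = phred33_scores(quality_string)
    keep = len(scores)
    while keep > 0 and scores[keep - 1] < cutoff:
        keep -= 1
    return sequence[:keep], quality_string[:keep]


# Hypothetical read: 'I' encodes Q40, '5' encodes Q20, '#' encodes Q2.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIII5I##", cutoff=20)
print(trimmed_seq, trimmed_qual)
```

With a cutoff of 20, only the two trailing Q2 bases are removed here; the Q20 base in the middle stays because trimming stops at the first base from the 3' end that meets the cutoff.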

ADD REPLY • link 17 months ago by GenoMax 141k

0

perfect, thanks!

ADD REPLY • link 16 months ago by beginner123 • 0

0

What kind of data is this? RNAseq?

Is it an 'old' dataset or something that has been generated recently?

ADD REPLY • link 17 months ago by lieven.sterck 15k

0

Yes, it is RNA-seq data (Illumina).

ADD REPLY • link 17 months ago by beginner123 • 0

2

The general trend you see on that plot--lower quality scores on the read ends, higher quality in the middle of the read--is pretty typical. Your average quality scores look decent, but the variance in quality score near the end of the read is pretty high. I don't usually see the interquartile range of the quality scores dip so low these days.

I would definitely recommend doing some read trimming/filtering and re-assessing the QC metrics after this step. A tool like fastp can give you good "before and after" QC results.
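If you want to check the per-position spread yourself, before and after trimming, something like the sketch below computes the quartiles of the quality scores at each read position. It assumes Phred+33 quality strings that you have already pulled out of the FASTQ file. The function name is made up, and the quartile method is Python's "inclusive" interpolation, which may differ slightly from what FastQC or fastp plot.

```python
from statistics import quantiles


def per_position_quartiles(quality_strings):
    """Return a list of (position, Q1, median, Q3), positions are 1-based.

    Assumes Phred+33 quality strings; reads may have different lengths,
    so later positions can be based on fewer reads.
    """
    columns = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(columns):
                columns.append([])
            columns[i].append(ord(ch) - 33)

    summary = []
    for pos, scores in enumerate(columns, start=1):
        if len(scores) < 2:
            # Too few reads at this position to interpolate quartiles.
            q1 = med = q3 = float(scores[0])
        else:
            q1, med, q3 = quantiles(scores, n=4, method="inclusive")
        summary.append((pos, q1, med, q3))
    return summary


# Hypothetical quality strings: 'I' encodes Q40, '#' encodes Q2.
for pos, q1, med, q3 in per_position_quartiles(["II", "II", "I#", "I#"]):
    print(pos, q1, med, q3, "IQR:", q3 - q1)
```

A wide gap between Q1 and Q3 at the last positions is the "high variance near the end of the read" pattern described above; after trimming, that gap should shrink.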

ADD REPLY • link 17 months ago by Dave Carlson ★ 1.7k

score 0 · Answer 1 · 2022-11-23

Well in my case the reads were generated using IlluminaSeq. What do I expect as normal when looking at the fastqc.html?

Well, for short reads like the ones you have here (~75 nt), I would expect better quality at the 3' end (Q ~28 or higher), because that is what you get when everything runs smoothly:

RNA quality (RIN >= 8, A260/280 ~2, etc.)
Concentration
library prep controls
Sequencing performance

Nonetheless, if the QC doesn't look ok to you, you can always trim/filter your sequences.

They may also lead to increased false-positive variant calls, resulting in inaccurate conclusions.

Not necessarily, it depends on many variables, including the coverage.
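A rough way to see why coverage matters: a Phred score Q corresponds to a per-base error probability of 10^(-Q/10), and a single miscalled base is rarely enough to produce a variant call when many reads cover the site. The Python sketch below is a deliberately simplified model, not how any real variant caller works. It assumes independent errors, the same Q for every base at the site, and made-up function names.

```python
from math import comb


def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def prob_at_least_k_errors(depth, k, q):
    """Probability that at least k of `depth` base calls at one site are wrong.

    Simplifying assumptions: errors are independent and every base has
    quality q. This overestimates the chance of k reads agreeing on the
    same wrong base, because errors can land on different bases.
    """
    p = phred_to_error_prob(q)
    below_k = sum(
        comb(depth, i) * p**i * (1 - p) ** (depth - i) for i in range(k)
    )
    return 1 - below_k


# Compare Q20 and Q30 bases at 30x depth, requiring 3 or more erroneous calls.
for q in (20, 30):
    print(q, phred_to_error_prob(q), prob_at_least_k_errors(30, 3, q))
```

The probability of several erroneous calls piling up at one site drops quickly as Q rises, and how many supporting reads a caller requires depends on the depth available, which is why lower Q scores do not automatically translate into false-positive variants.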