Question

FastQC 'ChIP-Seq' Quality Score Reverse Pattern [Quality Increasing At Ends]

1

13.4 years ago

Sukhi Singh 11k

Hello! I am encountering a strange problem. My FastQC per-base quality plot shows the quality score increasing toward the end of the reads, whereas we would expect to see a decrease at the end.

The FASTQ files were generated using CASAVA-1.8.0, so the quality scores should be Sanger encoded (Phred+33).
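For anyone who wants to check the trend outside FastQC, here is a minimal sketch of decoding Sanger (Phred+33) quality strings and averaging them per read position. The function names and the example quality strings are made up for illustration, and it assumes all reads have the same length; it is not how FastQC itself computes the plot.

```python
from statistics import mean

PHRED_OFFSET = 33  # Sanger / CASAVA 1.8+ encoding (assumed for these files)


def decode_quality(quality_line, offset=PHRED_OFFSET):
    """Convert one FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality_per_cycle(quality_lines):
    """Average Phred score at each read position across equal-length reads."""
    decoded = [decode_quality(q) for q in quality_lines]
    # zip(*decoded) groups the scores by cycle (column) instead of by read.
    return [mean(column) for column in zip(*decoded)]


# Hypothetical quality strings for three 5-base reads (not real data).
example_reads = ["IIIII", "HHHII", "5IIII"]
print(decode_quality("I5#"))  # [40, 20, 2]
print(mean_quality_per_cycle(example_reads))
```

If the per-cycle means rise toward the last cycles, that matches what the FastQC plot is showing.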

My previous graphs, from a different experiment, show a decrease at the end, unlike this one.

Why am I observing this pattern (an increase in quality scores at the end)?

Thanks for your comments.

chip-seq fastqc • 5.0k views

0

What is your question?

ADD REPLY • link 13.4 years ago by Niek De Klein ★ 2.6k

0

Why am I observing this pattern (an increase in quality scores at the end)?

ADD REPLY • link 13.4 years ago by Sukhi Singh 11k

0

Ever since we switched to running our samples on a HiSeq machine[*], all of our phred distros exhibit this exact same pattern, and I'd have to say: judging by this quality distro plot alone, your data actually looks pretty great.

[*]I'm not sure if it was the switch to the HiSeq, or the upgraded software/chemistry -- maybe GAIIx runs look like this now, too ... I wouldn't know, though.

ADD REPLY • link 13.4 years ago by Steve Lianoglou 5.2k

0

The modeling of error rates changed in recent versions of the Illumina software. See this question for more details: http://biostar.stackexchange.com/questions/12150/rna-seq-difference-in-read-quality-pattern-between-illumina-ga-and-hiseq-2000/12179#12179

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 13.4 years ago by Brad Chapman 9.7k

0

@Steve @Brad Thanks for your comments. I think the graph is fine; it's just the change in the error model by Illumina.

ADD REPLY • link 13.4 years ago by Sukhi Singh 11k

score 1 · Answer 1 · 2012-02-20

1

13.4 years ago

Istvan Albert 102k

I wouldn't read too much into it. As you can see, your error bars are actually increasing. Remember that these quality measures are unreliable approximations and should not be taken overly seriously.

There are only two values really - good (keep) and bad (reject), with a region in between that is a tossup.
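As a rough sketch of that keep / reject / tossup idea: the cutoffs of 25 and 10 below are illustrative assumptions rather than recommended values, and `classify_quality` is a hypothetical helper name.

```python
KEEP_ABOVE = 25    # assumed cutoff: scores above this count as "good"
REJECT_BELOW = 10  # assumed cutoff: scores below this count as "bad"


def classify_quality(q, keep_above=KEEP_ABOVE, reject_below=REJECT_BELOW):
    """Bin a Phred score into 'keep', 'reject', or the in-between 'tossup'."""
    if q > keep_above:
        return "keep"
    if q < reject_below:
        return "reject"
    return "tossup"


for score in (40, 20, 5):
    print(score, classify_quality(score))
```

With bins like these, a score moving from, say, 35 to 40 toward the end of the read does not change the decision, which is why the rising curve is not worth worrying about.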

0

Thanks Istvan, makes sense. Either q>25 or q<10. Cheers

ADD REPLY • link 13.2 years ago by Sukhi Singh 11k

score 1 · Answer 2 · 2012-02-22

As Brad correctly pointed out, it is due to the change from the five-parameter quality model to the six-parameter quality model.

From the tech note:

"Why did we move to the 6-predictor model? Although the 5-predictor model was very good at predicting quality, the 6-predictor model is more accurate and enables us to accurately predict the high percentage of Q40 data that was missed with the 5-predictor model. The new model is also faster and provides Quality scores after around cycle 11 in read 2 of paired-end reads (compared to around cycle 25 with the previous model)."

The tech note is available here: http://dl.dropbox.com/u/6634542/RTA_Quality_Predictors_TechNote.pdf
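To put the Q40 figure in context, a Phred score Q corresponds to an estimated error probability of 10^(-Q/10), so Q40 means roughly 1 error in 10,000 base calls. A small sketch of the conversion (the function name is hypothetical):

```python
def phred_to_error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


for q in (20, 30, 40):
    print(f"Q{q}: {phred_to_error_probability(q):.4g}")
```

This is only the standard Phred definition; it says nothing about how the 5- or 6-predictor models assign the scores in the first place.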

So this graph is correct and makes sense now.

Sukhi