Question: RNA-Seq: Difference in Read Quality Pattern Between Illumina GA and HiSeq 2000?
8.6 years ago, Bio_X2Y (3.8k) wrote:

In the past, we've used an Illumina GA for our RNA-seq experiments. In general, we noticed that the reported quality of the read bases was highest at the 5' end of each read, and the quality dropped gradually towards the 3' end (as per the FASTQ files). This is what we expected.

Recently, however, we've received an RNA-seq dataset generated from a HiSeq 2000, and notice a different pattern. The 5' bases have a high quality, but the quality actually improves in the 3' direction until about base 20 (out of 90), and then drops gradually.

Can someone perhaps comment on whether this alternative pattern is just a harmless artifact of the HiSeq 2000, or if it should be a cause for concern?
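The per-position profile described above can be checked directly from a FASTQ file. The following is a minimal sketch, not the method used by any particular QC tool: the function name `per_position_median_quality` is hypothetical, it assumes plain four-line FASTQ records (no wrapped sequence lines), and it assumes the Phred+33 ASCII encoding. Older Illumina pipelines (1.3 to 1.7) wrote Phred+64, so the `offset` argument may need to be 64 for GA-era files.

```python
import io
from statistics import median


def per_position_median_quality(handle, offset=33):
    """Return the median Phred score at each read position.

    Assumes four-line FASTQ records; `offset` is the ASCII offset of
    the quality encoding (33 assumed here, 64 for older Illumina files).
    """
    columns = []  # columns[i] collects every score seen at position i
    for line_no, line in enumerate(handle):
        if line_no % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for i, ch in enumerate(line.rstrip("\n")):
            if i == len(columns):
                columns.append([])
            columns[i].append(ord(ch) - offset)
    return [median(scores) for scores in columns]


# Tiny in-memory FASTQ used only for illustration (made-up reads).
demo = io.StringIO(
    "@r1\nACGT\n+\nII5!\n"
    "@r2\nACGT\n+\nI?5#\n"
    "@r3\nACG\n+\nIII\n"
)
print(per_position_median_quality(demo))
```

For a real file, pass an open file handle instead of the in-memory example. This sketch keeps every score in memory, so for a full lane it would be more practical to keep a per-position histogram of scores instead.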


illumina quality rna hiseq • 3.0k views

Just wanted to add that we've also seen the same pattern: something like an upside-down smile (i.e., a frown), where bases 1-4, 5-9, and 10-14 increase in a step-like fashion, followed by a "normal" Phred-like distribution with a gradual, slight decrease in scores towards the 3' end. We're doing 50 bp runs, and the median score at base 50 is still ~36 (out of 40), so all in all it's still quite good for us.

Reply written 8.6 years ago by Steve Lianoglou (5.0k)

@steve: We also see a similar pattern: bases 1-3, 4-8, and 9-10 increase stepwise, followed by a gradual increase up to 50-60 bp and then a slow decrease towards the 3' end. We are running 104 bp reads, but overall the read qualities are good (median scores > 32).

Reply written 8.6 years ago by Rm (7.9k)
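To put the median scores quoted in the comments above in perspective, a Phred score Q corresponds to an estimated per-base error probability of 10^(-Q/10). A small sketch of that conversion (the function name is hypothetical):

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated error probability."""
    return 10 ** (-q / 10)


# Scores mentioned in this thread: median > 32, median ~36, maximum 40.
for q in (32, 36, 40):
    print(f"Q{q}: {phred_to_error_prob(q):.6f}")
```

On this scale, a median of 36 at the last base means roughly one expected error in 4,000 base calls at that position, assuming the scores are well calibrated.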
8.6 years ago, Brad Chapman (9.5k), Boston, MA, wrote:

Illumina changed the quality prediction in HCS 1.4 (RTA 1.12) to better model error rates at the 5' ends of the sequence. This tech note describes the change (I couldn't find it on the Illumina website, so the link is to my Dropbox):

Page 11 of the RTA Theory of Operations tech note has additional useful details:

So the new software is attempting to better model the underlying error rates; the pattern does not reflect a fundamental change in 5' sequence quality on the HiSeq.
