Illumina's HiSeq strange Phred quality score
5.6 years ago
abascalfederico ★ 1.2k

Hi,

I have some HiSeq sequencing data with unusual Phred quality scores. The minimum is "!" and the maximum is "K" (0-42). This does not match any of the usual schemes listed here: https://en.wikipedia.org/wiki/FASTQ_format

Since I cannot run a particular program with this scoring scheme, I guess I have to rescale the current scores to a standard scheme (e.g. Phred+33). Any hints on how I can do this? Would it be OK to just replace the "K"s with "J"s?

Thanks!

phred illumina hiseq • 3.3k views
5.6 years ago
Dan D 7.2k

You likely have data generated on a HiSeq 3000/4000 or X sequencer. K is the highest quality score on the X platform. Apart from that, the ASCII offset is the same as in the output of the prior generation of Illumina sequencers.
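
As a quick check of the offset, here is a minimal sketch that decodes quality characters assuming the Phred+33 offset. The function name `phred33_scores` and the sample string are made up for illustration. Under that assumption "!" decodes to 0 and "K" to 42, which matches the range in the question.

```python
def phred33_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes the Phred+33 ASCII offset; pass a different offset
    if the data uses another encoding.
    """
    return [ord(ch) - offset for ch in quality_string]


# Hypothetical quality string covering the range seen in the question.
print(phred33_scores("!5AJK"))  # [0, 20, 32, 41, 42]
```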

5.6 years ago
DVA ▴ 610

K is not illegal. You can refer to this post: Illumina X ten samples have phred scores out of range [0,41]

You can also find the phred score calculation here: http://drive5.com/usearch/manual/quality_score.html
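
For reference, the standard Phred definition is Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. A small sketch of the inverse conversion (the helper name `error_probability` is just for illustration):

```python
def error_probability(q):
    """Convert a Phred quality score Q into an error probability.

    Uses the standard Phred relation P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


# Compare the two scores discussed in this thread ("J" = 41, "K" = 42
# under Phred+33).
for q in (41, 42):
    print(q, f"{error_probability(q):.2e}")
```

So a "K" is a slightly more confident call than a "J", not an encoding error.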

I don't think you need to rescale your scores, unless your program expects an older encoding. Which program are you concerned about?

I wouldn't recommend replacing K with J. Downstream software (e.g. GATK) in sequencing data analysis might need accurate scores to perform at its best.
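
If a tool really does refuse scores above 41, a last-resort sketch of the capping could look like the following. The function name `cap_quality` is hypothetical and Phred+33 is assumed. Keep in mind that this discards information: every Q42 call ends up reported as Q41.

```python
def cap_quality(quality_string, max_char="J"):
    """Clamp every quality character above max_char down to max_char.

    Assumes Phred+33, where "J" encodes Q41. Single characters compare
    by code point, so min() picks the lower-quality symbol.
    """
    return "".join(min(ch, max_char) for ch in quality_string)


# Only the quality line (4th line of each FASTQ record) should be
# passed through this function, never the sequence or header lines.
print(cap_quality("!5AJK"))  # !5AJJ
```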
