Question: Converting Illumina fastq quality scores to phred
0
4.1 years ago by
L. A. Liggett120
Broad Institute, Harvard Medical School, Boston Children's
L. A. Liggett120 wrote:

I have been trying to understand how to calculate probabilities of correct calls in my Illumina dna sequencing results coming off of both the nextseq and the hiseq 4000. My understanding is that these can use different formats like illumina 1.7 or 1.8 and that the quality scores will translate to a phred score differently.

So, is there a simple ascii conversion that can be used to convert the illumina quality scores into a phred score which can then be used with a formula like Q=-10log10P to compute the probability that the base call is correct?

sequencing • 6.5k views
modified 11 months ago by zubenel110 • written 4.1 years ago by L. A. Liggett120

Just so you know, you still have to know which version was used, else you'll be doing +64 to everything not +33. Perhaps the nextseq/hiseq 4000 are such new machines they never even came with anything <1.8

1

They are - Illumina's software for those platforms is ASCII-33 exclusively.

Ah awesome :)

3
4.1 years ago by
L. A. Liggett120
Broad Institute, Harvard Medical School, Boston Children's
L. A. Liggett120 wrote:

I came across the answer to this in the biostar handbook, it is super easy to calculate using the following code:

`python -c 'print ord("A")-33'`

And this can be easily converted to the probability of correct call as well:

`python -c 'from math import*; print 10**-((ord("A")-33)/10.0)'`

That second line shouldn't be importing math for no reason - i will fix it

1
4.1 years ago by
Devon Ryan98k
Freiburg, Germany
Devon Ryan98k wrote:

I actually read through this and looked up an ASCII conversion table, but I still don't understand how to do the conversion. Could you explain?

1

Subtract 33 from each of the quality scores to get the Phred score. For example, 'A' is ASCII letter 65. 'A'-33 = 65-33 = 32. So a quality score of 'A' means Phred 32, or slightly better than 99.9%.

0
11 months ago by
zubenel110
zubenel110 wrote:

Assuming phred+33 another easy solution is to use Perl oneliner like this:

`perl -E 'say ord("A")-33'`