Question: Converting Illumina FASTQ quality scores to Phred
Asked 3.3 years ago by L. A. Liggett (Broad Institute, Harvard Medical School, Boston Children's):

I have been trying to understand how to calculate the probability of a correct call in my Illumina DNA sequencing results coming off both the NextSeq and the HiSeq 4000. My understanding is that these can use different formats, like Illumina 1.7 or 1.8, and that the quality characters translate to a Phred score differently in each.

So, is there a simple ASCII conversion that turns the Illumina quality characters into Phred scores, which can then be used with a formula like Q = -10 log10(P), where P is the probability that the base call is wrong, to compute the probability that the call is correct?


Just so you know, you still have to know which version was used; otherwise you could end up applying an offset of 64 to everything instead of 33. Perhaps the NextSeq and HiSeq 4000 are such new machines that they never came with anything older than 1.8.

Reply written 3.3 years ago by John

They are. Illumina's software for those platforms uses ASCII-33 (Phred+33) exclusively.

Reply written 3.3 years ago by Brian Bushnell

Ah awesome :)

Reply written 3.3 years ago by John
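
For files of unknown origin, the point about needing to know the encoding version can often be checked from the quality characters themselves. The following is only a heuristic sketch: the function name `guess_phred_offset` and the example strings are made up for illustration, and the cutoffs assume the usual ranges (Phred+33 data rarely goes above 'J', and no +64 encoding goes below ';').

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from observed quality characters.

    Heuristic only: returns None when the observed range fits both encodings.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:    # below ';', which no +64 encoding can produce
        return 33
    if highest > 74:   # above 'J' (Q41 under +33), typical of +64 data
        return 64
    return None        # ambiguous: every character fits either encoding


# Hypothetical quality lines, not taken from a real run
print(guess_phred_offset(["IIII#5AA", "FFFF:FFF"]))
print(guess_phred_offset(["hhhhgffe", "ddddbbaa"]))
```

The more reads that are inspected, the less likely the ambiguous case becomes, since low-quality bases eventually show up and reveal the lower end of the range.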
Answer written 3.3 years ago by L. A. Liggett (Broad Institute, Harvard Medical School, Boston Children's):

I came across the answer to this in the Biostar Handbook. Assuming Phred+33 encoding, the Phred score is easy to calculate with the following code:

python -c 'print(ord("A") - 33)'

This can easily be converted to the probability that the call is wrong (subtract the result from 1 to get the probability of a correct call):

python -c 'from math import *; print(10 ** -((ord("A") - 33) / 10.0))'
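
As a slightly fuller sketch of the same idea, the one-liners above can be wrapped into functions that decode a whole quality string rather than a single character. The function names and the example string are made up for illustration, and the default offset of 33 assumes Phred+33 data.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def error_probabilities(quality, offset=33):
    """Per-base probability that the call is wrong: P = 10 ** (-Q / 10)."""
    return [10 ** (-q / 10) for q in phred_scores(quality, offset)]


quality = "AAF#I"  # hypothetical quality line from a FASTQ record
for ch, q, p in zip(quality, phred_scores(quality), error_probabilities(quality)):
    print(f"{ch}  Q={q:2d}  P(error)={p:.5f}  P(correct)={1 - p:.5f}")
```

Note that 10 ** (-Q / 10) is the error probability, so the probability of a correct call is one minus that value.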


That second line shouldn't be importing math for no reason. I will fix it.

Reply written 3.3 years ago by John

Answer written 3.3 years ago by Devon Ryan (Freiburg, Germany):

See the Wikipedia page, noting that everything is Phred+33 these days.


I actually read through this and looked up an ASCII conversion table, but I still don't understand how to do the conversion. Could you explain?

Reply written 3.3 years ago by L. A. Liggett

Subtract 33 from the ASCII code of each quality character to get the Phred score. For example, 'A' has ASCII code 65, so 'A' - 33 = 65 - 33 = 32. A quality character of 'A' therefore means Phred 32, or slightly better than 99.9% accuracy.
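
The 99.9% figure follows from inverting the Phred definition, where P is the probability that the base call is wrong. A worked version of the arithmetic for 'A', assuming Phred+33:

```latex
\begin{align*}
Q &= -10 \log_{10} P \quad\Longrightarrow\quad P = 10^{-Q/10} \\
Q &= 65 - 33 = 32 \\
P &= 10^{-3.2} \approx 6.3 \times 10^{-4} \\
1 - P &\approx 0.99937
\end{align*}
```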

Reply written 3.3 years ago by Brian Bushnell

Answer written 7 weeks ago by zubenel:

Assuming Phred+33, another easy solution is a Perl one-liner like this:

perl -E 'say ord("A")-33'
