Ask About The Empirical Quality Score Mentioned In Gatk Recalibration Step
9.3 years ago
Liye Zhang ▴ 80

Hi,

I tried to find more details about the empirical quality score used in the comparison in the GATK paper on quality score recalibration (more specifically, figure 3 in the paper: http://www.nature.com/ng/journal/v43/n5/abs/ng.806.html), but I could not find any. Can someone give me some idea of how empirical quality scores are calculated and obtained? As I understand it, there is sequencing bias in base pair composition and read length, so the recalibrated quality score should be better. Still, it would be great if someone could explain or elaborate on the concept of the empirical quality score a little more.

Thanks.

gatk quality scoring • 3.3k views

Have you looked at the methods section of the paper that you are referring to? It seems to describe the mathematical background of the base quality recalibration. I'm sorry I can't give you a better answer than that - my own understanding of the procedure doesn't extend further than that.

9.3 years ago

You should consult the online methods section here: http://www.nature.com/ng/journal/v43/n5/extref/ng.806-S1.pdf

There you will find the "Base miscalling confusion matrices" section, which describes how they found the empirical error rates that are specific to each platform and each miscall.

The way I understood this is that different miscalls occur at different rates even though the reported quality may be the same.
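
To make the idea concrete, here is a minimal sketch of how an empirical quality could be computed from counts. It assumes the empirical quality is simply the Phred-scaled mismatch rate among bases that share the same reported quality; the function name, the pseudocount smoothing and the counts are all illustrative assumptions, not necessarily what GATK implements.

```python
import math

def empirical_quality(mismatches, observations, pseudocount=1):
    """Phred-scaled observed error rate.

    The pseudocount (an assumption for this sketch) keeps the rate
    away from 0 when a bin has no mismatches at all.
    """
    error_rate = (mismatches + pseudocount) / (observations + 2 * pseudocount)
    return -10 * math.log10(error_rate)

# Hypothetical counts: reported quality -> (observations, mismatches)
counts = {
    20: (500_000, 9_000),
    30: (1_000_000, 3_000),
}

for reported_q, (n_obs, n_mismatch) in sorted(counts.items()):
    q_emp = empirical_quality(n_mismatch, n_obs)
    print(f"reported Q{reported_q}: empirical Q{q_emp:.1f}")
```

If the reported scores were perfectly calibrated, the empirical value in each bin would match the reported one; a gap between the two is what recalibration tries to correct.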

9.3 years ago

GATK's base quality score recalibration step is broadly described on its wiki page. The way I understand this step is, roughly speaking, like removing the background noise from an analog signal. Once known polymorphic sites are excluded from the process (dbSNP sites are suggested for this), the idea is to normalize all the base qualities by lowering the "noise" they may share. This way you'll be able to find low signals (low-quality bases) that would otherwise be "lost in translation".
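
The exclusion of known polymorphic sites can be sketched as follows. This is only an illustration under stated assumptions: the input tuples, the function names and the +1/+2 smoothing are hypothetical, and real recalibration works on aligned reads with more covariates than the reported quality alone.

```python
import math
from collections import defaultdict

def tally_by_reported_quality(bases, known_sites):
    """Count observations and mismatches per reported quality.

    bases: iterable of (position, read_base, ref_base, reported_q).
    known_sites: set of positions to skip (e.g. taken from dbSNP).
    """
    table = defaultdict(lambda: [0, 0])  # reported_q -> [observations, mismatches]
    for position, read_base, ref_base, reported_q in bases:
        if position in known_sites:
            continue  # a mismatch here may be a real variant, not an error
        table[reported_q][0] += 1
        if read_base != ref_base:
            table[reported_q][1] += 1
    return dict(table)

def phred(mismatches, observations):
    # +1 / +2 smoothing is an illustrative assumption
    return -10 * math.log10((mismatches + 1) / (observations + 2))

# Hypothetical toy data
bases = [
    (100, "A", "A", 30),
    (101, "C", "T", 30),  # mismatch at a known site: skipped
    (102, "G", "G", 30),
    (103, "T", "C", 30),  # mismatch elsewhere: counted as an error
    (104, "A", "A", 20),
]
known_sites = {101}

table = tally_by_reported_quality(bases, known_sites)
for reported_q, (n_obs, n_mismatch) in sorted(table.items()):
    print(reported_q, n_obs, n_mismatch, round(phred(n_mismatch, n_obs), 1))
```

Mismatches at the remaining sites are treated as sequencing errors, which is why masking known variants first matters for the resulting error rates.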
