Question: Low Empirical Scores for Exome Data During Base Quality Score Recalibration in GATK
anchal wrote (6.4 years ago):

I am analyzing exome sequencing data (~100x coverage). The pipeline uses BWA for alignment and then GATK for variant calling. During BQSR (one of the steps in variant calling), the empirical base quality scores generated for my data come out very low: the average of the empirical values in the recal file is 19. As a result, the recalibration plots do not look as they should (attached).

Despite the poor empirical scores, I went ahead and recalibrated my data, and after recalibration the average corrected base quality score is as low as 16! I am not sure whether this recalibration is sound, since the confidence of variant calling will certainly drop with such low base quality scores.
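For context, an empirical quality is a Phred-scaled mismatch rate, so an average of 19 means roughly one mismatch per 80 bases and 16 means roughly one per 40. Here is a minimal sketch of that arithmetic, assuming hypothetical per-bin (observations, errors) counts. It does not reproduce the smoothing GATK applies to its counts, and the numbers are illustrative, not taken from the data above.

```python
import math


def error_probability(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10.0)


def empirical_quality(observations, errors):
    """Phred-scaled mismatch rate: Q = -10 * log10(errors / observations)."""
    if observations <= 0:
        raise ValueError("observations must be positive")
    # Floor the error count at 1 to avoid log10(0).
    # This is an illustrative floor, not GATK's own smoothing.
    rate = max(errors, 1) / observations
    return -10.0 * math.log10(rate)


def pooled_empirical_quality(bins):
    """Pool counts over all bins first, then convert to a Phred score."""
    total_obs = sum(obs for obs, _ in bins)
    total_err = sum(err for _, err in bins)
    return empirical_quality(total_obs, total_err)


if __name__ == "__main__":
    # Hypothetical (observations, errors) pairs for two covariate bins.
    bins = [(1_000_000, 1_000), (500_000, 5_000)]
    print(round(pooled_empirical_quality(bins), 2))
    print(round(error_probability(19), 4), round(error_probability(16), 4))
```

Pooling the counts before converting is not the same as averaging the per-bin Q values, so a plain average of the empirical column can differ from the overall empirical quality.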

My raw data and quality checks otherwise look fine. Also, when I run recalibration on 30x coverage data, everything seems fine. I have no idea why the empirical scores drop for high-coverage data!

Can anyone help me with this?


Have you looked at your data in a browser? Are there a lot of mismatches after alignment?
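Besides a visual check, the mismatch level can be estimated from the alignments themselves. The sketch below is a rough proxy, assuming SAM text lines (for example from `samtools view`) in which the aligner wrote the optional `NM:i:` edit-distance tag. NM also counts indels and the read length includes soft-clipped bases, so treat the result as an approximation. The function name and example reads are hypothetical.

```python
def mismatch_rate(sam_lines):
    """Approximate edit distance per read base, using NM tags of mapped reads."""
    total_nm = 0
    total_bases = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        seq = fields[9]
        if flag & 0x4 or seq == "*":  # unmapped read or sequence not stored
            continue
        nm = None
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                nm = int(tag[5:])
                break
        if nm is None:  # aligner did not record an edit distance
            continue
        total_nm += nm
        total_bases += len(seq)
    return total_nm / total_bases if total_bases else 0.0


if __name__ == "__main__":
    # Made-up records: two mapped reads and one unmapped read.
    example = [
        "@HD\tVN:1.0\tSO:coordinate",
        "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:1",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
        "r3\t0\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0",
    ]
    print(mismatch_rate(example))
```

A rate far above what the sequencer's reported qualities imply would point at the alignment rather than the raw reads.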

written 6.4 years ago by Sean Davis
anchal wrote (6.4 years ago):

I was able to solve the problem. The cause was the BWA version I was using to align the data (the data quality itself was very good). The same data aligned with BWA 0.5.9 gave very poor empirical scores, whereas aligning with 0.6.1 gave perfectly fine results. So be careful which aligner version you use!
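To check which aligner version produced an existing file, the SAM/BAM header can be inspected. This is a minimal sketch, assuming the aligner recorded an `@PG` header line with `PN` and `VN` tags as described in the SAM specification. Not every tool or version writes one, and the header text in the example is illustrative only.

```python
def aligner_versions(header_lines):
    """Return (program, version) pairs from the @PG lines of a SAM header."""
    programs = []
    for line in header_lines:
        if not line.startswith("@PG"):
            continue
        tags = {}
        for field in line.rstrip("\n").split("\t")[1:]:
            key, _, value = field.partition(":")
            tags[key] = value
        # Prefer the program name (PN); fall back to the record ID.
        name = tags.get("PN") or tags.get("ID", "unknown")
        programs.append((name, tags.get("VN", "unknown")))
    return programs


if __name__ == "__main__":
    # Illustrative header, e.g. as printed by `samtools view -H`.
    header = [
        "@HD\tVN:1.0\tSO:coordinate",
        "@SQ\tSN:chr1\tLN:1000",
        "@PG\tID:bwa\tPN:bwa\tVN:0.6.1-r104",
    ]
    for name, version in aligner_versions(header):
        print(name, version)
```

Comparing this against the version that gave good results makes it easier to spot files that still need realigning.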
