17 months ago
JM • 0

Hello,

Can you help me understand why the following variant has a Fisher strand (FS) Phred-scaled p-value of 0?

11      65107949        .       G       A       48.06   PASS    AC=1;AF=0.500;AN=2;DP=73;FS=0.000;MQ=245.31;MQRankSum=7.260;QD=0.66;ReadPosRankSum=4.771;SOR=0.633;FractionInformativeReads=1.000;R2_5P_bias=-4.567 GT:AD:AF:DP:F1R2:F2R1:GQ:PL:GP:PRI:SB:MB        0/1:42,31:0.425:73:25,19:17,12:47:83,0,50:4.806e+01,8.983e-05,5.300e+01:0.00,34.77,37.77:25,17,18,13:24,18,14,17


I know FS isn't calculated when, for example, there are no forward (or reverse) strand reads. Are there other situations in which it isn't calculated?

Thanks very much

What tool generated this data? Including the command line can be helpful too.

From what I can interpret of the data you posted originally, it looks like quite a normal HET, with basically balanced reads. What did you expect of the FS field?

This thread, https://gatkforums.broadinstitute.org/gatk/discussion/6724/question-for-fisherstrand, seems to say that a high FS indicates strand bias, and possibly false positives from low-coverage regions. It makes sense that a position with "25,19:17,12" shows essentially no apparent strand bias.

While it seems fairly balanced, I'm surprised that it is 0.000 (a p-value of 1). Even a p-value of 0.999 would still give a Phred-scaled value of 0.004.

p-values are not effect sizes. Anything with p>0.2 is equivalent to saying 'no evidence', so you shouldn't read any meaning into the difference between p=0.5 and p=0.9. p-values tend to get conflated with strength of association because in introductory statistics the two may appear to be correlated, but in complicated situations, sequencing quality especially, 'all other things equal' does not hold. So I wouldn't be surprised if the GATK folks just have the system default to FS=0 because calculating p=0.999 is a waste of time.
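The arithmetic discussed in this thread can be checked by hand. Below is a minimal sketch, not the variant caller's actual implementation. It assumes that the SB field (25,17,18,13) is ordered ref-forward, ref-reverse, alt-forward, alt-reverse, and that FS is simply -10*log10 of the two-sided Fisher's exact test p-value on that 2x2 table. The function names are made up for illustration.

```python
from math import comb, log10


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is no more likely than the observed one
    # (small relative tolerance to absorb floating-point noise).
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= observed * (1 + 1e-7))
    return min(1.0, p)


def phred(p):
    """Phred-scale a p-value: -10 * log10(p), clamped at 0."""
    return max(0.0, -10 * log10(p))


# ASSUMPTION: SB = ref_fwd, ref_rev, alt_fwd, alt_rev
ref_fwd, ref_rev, alt_fwd, alt_rev = 25, 17, 18, 13
p = fisher_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
fs = phred(p)
print(f"p = {p:.3f}, FS = {fs:.3f}")
print(f"Phred of p=0.999: {phred(0.999):.3f}")
```

Under these assumptions, the observed table is the single most likely table given its row and column totals. Every alternative table is therefore at least as "extreme", so the two-sided p-value sums to 1 and the Phred-scaled value comes out as 0.000. If the caller works this way, FS=0.000 would be a genuine result and not a skipped calculation.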