Question: Fisher Exact Test, Strand Bias Of Gatk
6.8 years ago, juancarlos wrote:

Hi

I have some questions about GATK; I hope someone can help me :)

What Fisher exact test does GATK use to calculate strand bias? Is it weighted or not? I haven't seen any documentation about the statistical calculation.

Another question: is it possible to obtain the number of forward and reverse reads at a variant position?

Thank you very much

Best regards

6.8 years ago, brentp (Salt Lake City, UT) wrote:

see: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_annotator_FisherStrand.html

It's basically creating a 2x2 contingency table of

  1. #ref alleles on + strand
  2. #ref alleles on - strand
  3. #alt alleles on + strand
  4. #alt alleles on - strand
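As a rough illustration of the test run on that table, here is a minimal sketch of a textbook two-sided Fisher exact test in plain Python. The function name and the tie tolerance are assumptions made for this example; GATK's own implementation may differ in its details.

```python
import math

def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher exact p-value for a 2x2 strand table (illustrative sketch)."""
    row_ref = ref_fwd + ref_rev        # all reads carrying the ref allele
    row_alt = alt_fwd + alt_rev        # all reads carrying the alt allele
    col_fwd = ref_fwd + alt_fwd        # all reads on the + strand
    total = row_ref + row_alt

    def table_prob(a):
        # Hypergeometric probability of seeing `a` ref reads on the + strand
        # when the row and column totals are held fixed.
        return (math.comb(row_ref, a) * math.comb(row_alt, col_fwd - a)
                / math.comb(total, col_fwd))

    observed = table_prob(ref_fwd)
    lowest = max(0, col_fwd - row_alt)
    highest = min(row_ref, col_fwd)
    probs = [table_prob(a) for a in range(lowest, highest + 1)]
    # Sum every table that is at least as unlikely as the observed one.
    # The small relative tolerance (an assumption) guards against float ties.
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-7))
    return min(1.0, p_value)
```

For example, `fisher_exact_two_sided(10, 0, 0, 10)` describes a site where every ref read is on the + strand and every alt read is on the - strand, which gives a very small p-value, while a balanced table such as `(10, 10, 10, 10)` gives a p-value of about 1.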

You can always get the read counts using pileup. I'm not sure if GATK reports them.
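One possible way to get those counts, sketched under the assumption that the pileup was produced with a reference (for example by samtools mpileup), so that `.` and `,` mark reference matches on the forward and reverse strand and upper/lower-case letters mark mismatches on the forward and reverse strand. The function name is made up for this example, and all non-reference bases are lumped together as "alt".

```python
def strand_counts(bases):
    """Count ref/alt bases per strand from the read-bases column of one pileup line."""
    counts = {"ref_fwd": 0, "ref_rev": 0, "alt_fwd": 0, "alt_rev": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker; the next character encodes mapping quality.
            i += 2
            continue
        if c in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
            continue
        if c == ".":
            counts["ref_fwd"] += 1
        elif c == ",":
            counts["ref_rev"] += 1
        elif c in "ACGTN":
            counts["alt_fwd"] += 1
        elif c in "acgtn":
            counts["alt_rev"] += 1
        # Anything else ('$' read end, '*' deletion placeholder, ...) is ignored.
        i += 1
    return counts
```

The input is the fifth tab-separated column of a pileup line, e.g. `strand_counts(line.split("\t")[4])`. The four numbers it returns are the cells of the 2x2 table described above.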

6.8 years ago, Ashutosh Pandey (Philadelphia) wrote:

Brent is correct. Just to answer your other question: Unified Genotyper doesn't report the read counts for the forward and reverse strands individually.

Also, you may have noticed that there are two different tags for strand bias in the GATK output: 1) SB and 2) FS.

SB is what Brent explained. But people usually use FS, which is the "Phred-scaled p-value using Fisher's exact test to detect strand bias".

Normally you remove any SNP with FS > 60.0 and any indel with FS > 200.0.
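A minimal sketch of applying those two cutoffs to a single VCF data line, in plain Python. The function names are hypothetical, and the rule "any ALT allele whose length differs from REF is an indel" is a simplifying assumption; in practice a dedicated filtering tool would normally do this job.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=30;FS=75.2' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields

def fails_fs_filter(vcf_line, snp_max=60.0, indel_max=200.0):
    """Return True if the record's FS value exceeds the cutoff for its variant type."""
    columns = vcf_line.rstrip("\n").split("\t")
    ref, alts = columns[3], columns[4].split(",")
    info = parse_info(columns[7])
    if "FS" not in info:
        return False                      # nothing to judge without an FS annotation
    # Simplifying assumption: a length difference between REF and ALT means indel.
    is_indel = any(len(alt) != len(ref) for alt in alts)
    cutoff = indel_max if is_indel else snp_max
    return float(info["FS"]) > cutoff
```

With these defaults, a SNP with FS=75.2 would be flagged, while an indel with the same FS would pass, since the indel cutoff is the more lenient one.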


6.8 years ago, brentp replied:

Good to know. Do you mean FS > 60 as a genome-wide cutoff? You might want a smaller one for a subset or a selected region.

6.8 years ago, Ashutosh Pandey replied:

You may be right. I got these values from the GATK web site and have seen a few papers using them, but I have no idea how they came up with these thresholds. When I use samtools, which gives you the p-value for the strand bias, I remove variants with a strand-bias p-value < 0.0001, and I assume that translates to a Phred score of 40. So a smaller FS cutoff should make more sense.
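The conversion behind that comparison is the standard Phred scaling, Q = -10 * log10(p). A small sketch (helper names are made up for this example) to move between the two scales:

```python
import math

def p_to_phred(p):
    """Phred-scale a p-value: Q = -10 * log10(p). Requires p > 0."""
    return -10.0 * math.log10(p)

def phred_to_p(q):
    """Invert the Phred scale: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)
```

On this scale a p-value of 0.0001 corresponds to a Phred score of about 40, while FS cutoffs of 60 and 200 correspond to p-values of roughly 1e-6 and 1e-20, so the FS cutoffs quoted above are considerably more lenient than a p < 0.0001 filter.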
6.8 years ago, juancarlos wrote:

OK, the FS formulation and the thresholds are very clear.

Thank you very much, brentp
