Question: BaseQRankSum and FS
9 months ago
Tania120

Hi All

Can anyone give me a rough cutoff for BaseQRankSum and FS in a VCF? I need a hint about what is too low to accept and what counts as good. For example, I can't interpret something like this:

BaseQRankSum=1.875;DP=1987;Dels=0.00;FS=3.500;HaplotypeScore=1.1051;InbreedingCoeff=-0.0043;MLEAC=3;MLEAF=0.019;MQ0=1;MQ=58.73;MQRankSum=0.871;QD=13.15;ReadPosRankSum=-0.989;SOR=1.630        GT:AD:DP:GQ:PL  0/1:7,14:21:99:383,0,139


24 days ago
miaowzai100

I haven't seen anyone use these two values for hard filtering. BaseQRankSum is basically a z-score (from a rank-sum test) comparing the base qualities of the reads supporting the reference allele with those supporting the alternative allele. A value close to 0 means the two alleles have similar base qualities; a value around 2 means they differ by about 2 SDs, and a positive sign means the alternative allele has the higher qualities. If the value is far from 0, there is probably some sequencing bias. I don't know much detail about that. I'm guessing you could use 0.5, 1, 1.5, or 2 as the cutoff on the absolute value, depending on how strict you want the filter to be and how many loci are left afterwards. Reference:
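To make the absolute-value cutoff concrete, here is a minimal sketch in plain Python. The helper names `parse_info` and `passes_baseq` and the default cutoff of 2.0 are illustrative assumptions, not part of any library or recommended pipeline. Keeping sites that lack the annotation is also just one possible choice.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def passes_baseq(info, max_abs_z=2.0):
    """Return True if |BaseQRankSum| <= max_abs_z (hypothetical cutoff).

    Sites without a BaseQRankSum annotation are kept; that is an
    arbitrary choice made for this sketch.
    """
    value = parse_info(info).get("BaseQRankSum")
    if value is None:
        return True
    return abs(float(value)) <= max_abs_z


# INFO field from the record in the question
info = ("BaseQRankSum=1.875;DP=1987;Dels=0.00;FS=3.500;HaplotypeScore=1.1051;"
        "InbreedingCoeff=-0.0043;MLEAC=3;MLEAF=0.019;MQ0=1;MQ=58.73;"
        "MQRankSum=0.871;QD=13.15;ReadPosRankSum=-0.989;SOR=1.630")

print(passes_baseq(info))                 # True: 1.875 is within +/-2.0
print(passes_baseq(info, max_abs_z=1.5))  # False: 1.875 exceeds 1.5
```

With the record from the question, the site survives a cutoff of 2 but not a stricter cutoff of 1.5, which shows how much the choice of threshold matters.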

FS means Fisher strand. It is another measure of sequencing bias: it tests whether the alternative allele is seen more often on one strand than on the other, compared with the reference allele. GATK reports it as the Phred-scaled p-value of Fisher's exact test, so larger values mean stronger evidence of strand bias. I suggest you extract all your FS values, plot them in a histogram, and choose a cutoff depending on how many loci you'd like to retain. Reference:
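The extract-and-plot suggestion could be sketched roughly as follows, using only the standard library. The function names, the 10-unit bin width, and the toy FS values are all made up for illustration; with a real file you would pass the lines of your own VCF instead.

```python
import math


def fs_values(vcf_lines):
    """Collect FS values from the INFO column (8th column) of VCF data lines."""
    values = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue
        for item in cols[7].split(";"):
            if item.startswith("FS="):
                values.append(float(item[3:]))
                break
    return values


def cutoff_for_retention(values, keep_fraction):
    """Smallest FS such that at least keep_fraction of sites have FS <= it."""
    ordered = sorted(values)
    index = max(0, math.ceil(keep_fraction * len(ordered)) - 1)
    return ordered[index]


def text_histogram(values, bin_width=10.0):
    """Print a crude text histogram of the values."""
    counts = {}
    for v in values:
        start = int(v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    for start in sorted(counts):
        print(f"{start:6.1f}-{start + bin_width:<6.1f} {'#' * counts[start]}")


# Toy records with invented FS values, only to exercise the functions
toy_fs = [0.0, 3.5, 1.2, 12.7, 65.3, 2.1, 0.8, 5.4, 31.0, 4.4]
lines = ["##fileformat=VCFv4.1"] + [
    f"chr1\t{100 + i}\t.\tA\tG\t50\t.\tDP=20;FS={fs:.3f};MQ=58.73"
    for i, fs in enumerate(toy_fs)
]

values = fs_values(lines)
text_histogram(values)
print("FS cutoff retaining 90% of sites:", cutoff_for_retention(values, 0.9))
```

On real data the histogram usually makes it easier to see where the long tail of high FS values starts than a single quantile does, so the retention-based cutoff is best treated as a starting point.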

