Question: Why are the samtools/bcftools PV4 t-tests one-sided?
Asked 8.8 years ago by Casbon:

From the mpileup page, we have the definition:

PV4: P-values for 1) strand bias (exact test); 2) baseQ bias (t-test); 3) mapQ bias (t); 4) tail distance bias (t)

Looking at the source for this t-test (I couldn't find any further documentation), we can see on line 61:

if (u1 <= u2) return 1.;

At this point, u1 and u2 are the two sample means being compared. So this t-test returns 1 whenever u2 is at least as large as u1, i.e. in that direction it never rejects the null hypothesis that the sample means are the same. For mapping quality, for example, this means we only test whether the mapping quality is lower in the non-reference reads.
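To make the behaviour concrete, here is a minimal Python sketch of a pooled two-sample t-test with the same early return. It is not the samtools code: the function names, the `two_sided` switch, the small-sample and zero-variance guards, and the Simpson-rule tail integration are all my own assumptions for illustration, and I am assuming u1 is the reference-read mean and u2 the non-reference mean.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom, for t >= 0.

    Uses Simpson's rule on the density over [0, t]; adequate for a sketch,
    but not accurate for extremely small p-values.
    """
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 0.5 - total * h / 3)


def pooled_ttest(ref, alt, two_sided=False):
    """Pooled-variance t-test p-value for mean(ref) versus mean(alt)."""
    n1, n2 = len(ref), len(alt)
    if n1 == 0 or n2 == 0 or n1 + n2 < 3:
        return 1.0  # guard added for this sketch: too little data to test
    u1, u2 = sum(ref) / n1, sum(alt) / n2
    if not two_sided and u1 <= u2:
        return 1.0  # the early return quoted above
    ss = sum((x - u1) ** 2 for x in ref) + sum((x - u2) ** 2 for x in alt)
    df = n1 + n2 - 2
    se = math.sqrt(ss / df * (1 / n1 + 1 / n2))
    if se == 0.0:
        return 1.0 if u1 == u2 else 0.0  # degenerate case: no variance at all
    t = (u1 - u2) / se
    tail = t_upper_tail(abs(t), df)
    return min(1.0, 2 * tail) if two_sided else tail


# Made-up mapping qualities, purely for illustration.
ref_mapq = [60, 60, 59, 60]
alt_mapq = [20, 25, 30, 22]
print(pooled_ttest(ref_mapq, alt_mapq))                  # small: alt reads map worse
print(pooled_ttest(alt_mapq, ref_mapq))                  # 1.0: this direction is ignored
print(pooled_ttest(alt_mapq, ref_mapq, two_sided=True))  # small again
```

With the roles swapped, the one-sided version reports 1 no matter how large the difference is, while the two-sided version flags it in both directions.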

Why not use a two sided t-test to test for differences in means between quantities of interest?

Tags: vcf, bcftools, mpileup, statistics
Answer by lh3, 8.8 years ago:

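Reads with fewer mismatches to the reference are mapped better, so reads carrying the non-reference allele are expected to have lower mapping quality, not higher. baseQ is BAQ-adjusted, and mismatching bases tend to have lower BAQ, too. The same is true for the distance to the end of a read. In each case the bias is expected in one direction only, hence the one-sided test.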

In practice, the choice between a one-tailed and a two-tailed test has a negligible effect on the final SNP calls.
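For what it's worth, the two variants are related very simply. Write $t$ for the pooled t statistic (reference mean minus non-reference mean, divided by its standard error) and $\nu = n_1 + n_2 - 2$ for the degrees of freedom. This notation is mine, not taken from the samtools source, and it assumes the one-sided test is the standard upper-tail test:

```latex
p_{\text{one}} =
\begin{cases}
  P(T_\nu > t), & t > 0 \\
  1,            & t \le 0
\end{cases}
\qquad
p_{\text{two}} = 2\,P(T_\nu > |t|)
```

So for $t > 0$ the two p-values differ by exactly a factor of two, and a cutoff on one corresponds to a cutoff twice as large on the other. They disagree substantially only when the non-reference reads have the higher mean, which is the direction the one-sided test ignores on purpose.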
