What Coverage allele-fraction threshold to use?
3.0 years ago

I UV-mutagenized a haploid algal genome and want to view the variants in the BAM file in IGV. What Coverage allele-fraction threshold should I use to look at variants?

allele-fraction snps • 1.5k views

It depends on your sequencing coverage and your desired false discovery rate (FDR).


The sequencing coverage is >200x.


At each genomic position you have a specific coverage, say X. Y of those X reads may support an alternative allele, and the remaining X - Y support the reference. You need a statistical test (there are dozens of them, some more sophisticated than others) of how unusual it is to see Y such reads given your sequencing machine's error rate Z. You can simply use a binomial test. That gives you a set of p-values, one per position. Put these p-values through an FDR correction procedure and you get your approximate threshold.

This procedure has its drawbacks and is not exactly correct, but it may be useful.
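
The procedure above can be sketched in a few lines of Python. This is only a rough illustration, not a variant caller: the per-position counts in `sites`, the error rate `Z = 0.001`, and the FDR level `alpha = 0.05` are made-up values, and Benjamini-Hochberg is just one common choice of FDR correction (the procedure was not named above). The helper names `binom_sf` and `benjamini_hochberg` are my own.

```python
from math import exp, lgamma, log, log1p

def binom_sf(y, x, z):
    """One-sided binomial test p-value: P(at least y of x reads are errors)
    when each read is wrong independently with probability z (0 < z < 1)."""
    def log_pmf(k):
        # Log of the binomial probability mass, to stay stable at high coverage.
        return (lgamma(x + 1) - lgamma(k + 1) - lgamma(x - k + 1)
                + k * log(z) + (x - k) * log1p(-z))
    return sum(exp(log_pmf(k)) for k in range(y, x + 1))

def benjamini_hochberg(pvals, alpha=0.05):
    """Return one boolean per p-value: True if it passes BH at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Largest rank k such that the k-th smallest p-value is <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    passed = [False] * m
    for rank, i in enumerate(order, start=1):
        passed[i] = rank <= k_max
    return passed

# Hypothetical (coverage X, alt-supporting reads Y) at five positions.
sites = [(200, 0), (200, 1), (200, 5), (200, 198), (250, 1)]
Z = 0.001  # assumed per-base substitution error rate

pvals = [binom_sf(y, x, Z) for x, y in sites]
calls = benjamini_hochberg(pvals, alpha=0.05)

for (x, y), p, keep in zip(sites, pvals, calls):
    print(f"X={x} Y={y} AF={y / x:.3f} p={p:.3g} keep={keep}")
```

With these toy numbers, a single alt read out of 200 or 250 is not distinguishable from sequencing error, while 5 or 198 alt reads out of 200 are. On a real genome you would run the test at every covered position, so the number of tests (and therefore the correction) is much larger.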


Can you point me to some statistical tests?


For example, say your probability of a substitution error is 0.1% (on Illumina machines it is a very small number). You see 5 reads with nucleotide A and 195 reads with nucleotide B at some position. You apply a binomial test and get a p-value ( https://en.wikipedia.org/wiki/Binomial_test ).

This is the simplest test. People also model errors with more sophisticated regression models with various link functions, but you can start with the binomial test.
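
To make the 5-versus-195 example concrete, here is a small Python sketch that computes its one-sided binomial p-value and then inverts the test to find the smallest alt-read count (and allele fraction) that would be significant at a given coverage. The cutoff `ALPHA = 1e-5` is a hypothetical value chosen only for illustration, and `binom_tail` and `min_alt_reads` are names I made up.

```python
from math import exp, lgamma, log, log1p

def binom_tail(y, x, z):
    """P(at least y error reads out of x) when each read errs with probability z."""
    def log_pmf(k):
        return (lgamma(x + 1) - lgamma(k + 1) - lgamma(x - k + 1)
                + k * log(z) + (x - k) * log1p(-z))
    return sum(exp(log_pmf(k)) for k in range(y, x + 1))

def min_alt_reads(x, z, alpha):
    """Smallest alt-read count whose one-sided p-value is <= alpha, else None."""
    for y in range(x + 1):
        if binom_tail(y, x, z) <= alpha:
            return y
    return None

COVERAGE = 200       # 5 reads of A + 195 reads of B
ERROR_RATE = 0.001   # 0.1% substitution error
ALPHA = 1e-5         # hypothetical p-value cutoff, not from the thread

p = binom_tail(5, COVERAGE, ERROR_RATE)
y_min = min_alt_reads(COVERAGE, ERROR_RATE, ALPHA)

print(f"p-value for 5 of {COVERAGE} reads: {p:.3g}")
print(f"minimum alt reads at alpha={ALPHA}: {y_min} (AF = {y_min / COVERAGE:.3f})")
```

Under these assumptions the p-value for 5 of 200 reads comes out at roughly 2e-6, so 5 reads is the smallest count that clears the 1e-5 cutoff at 200x. A stricter cutoff or a higher error rate pushes the minimum count, and hence the allele fraction, upward.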
