Hello,

My group has finally started running NGS experiments, and now that the first wave of results has come in, we want to work out the minimum coverage required to detect a mutation at a given frequency (e.g. 10%) with a given level of power (e.g. 90%), given that a mutation needs to be present in at least 2 reads on one strand and 1 on the other.

We have a table for that, but I'd like a formula (which is basically combinatorial probability) that can be generalized, so that I can put it into the pipeline that does QC after the reads are processed and aligned. Is there any paper that addresses this issue? Or can a formula, given the starting parameters, be calculated easily?

Thanks in advance.

Is this pooled seq?

Good question, I hope someone can give some clues.

If you have your results in VCF format, vcftools can do it easily.

For which ploidy? Haploid or diploid?
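
On the formula itself, here is a rough sketch rather than a reference to a paper. It assumes (my assumptions, not stated in the question) that reads are independent, that each read carries the mutation with probability f, that a read lands on the forward strand with probability s = 0.5, and that sequencing errors are ignored. The rule "at least 2 reads on one strand and 1 on the other" is the same as "at least 1 supporting read on each strand and at least 3 in total", so the failure cases are: no supporting read on the forward strand, none on the reverse strand, or exactly one on each. The function names detection_power and min_coverage are made up for illustration.

```python
def detection_power(n, f, s=0.5):
    """Probability of calling a variant at read-level frequency f with n reads.

    Calling rule (from the question): >= 2 supporting reads on one strand
    and >= 1 on the other, i.e. x >= 1, y >= 1 and x + y >= 3.
    Assumed model: independent reads, each carrying the variant with
    probability f and mapping to the forward strand with probability s.
    """
    if n < 3:
        return 0.0  # the rule needs at least 3 supporting reads
    p_fwd = f * s            # read is variant AND on the forward strand
    p_rev = f * (1.0 - s)    # read is variant AND on the reverse strand
    no_fwd = (1.0 - p_fwd) ** n   # P(no forward-strand support)
    no_rev = (1.0 - p_rev) ** n   # P(no reverse-strand support)
    no_alt = (1.0 - f) ** n       # P(no support at all), counted twice above
    # P(exactly one supporting read on each strand)
    one_each = n * (n - 1) * p_fwd * p_rev * (1.0 - f) ** (n - 2)
    return 1.0 - no_fwd - no_rev + no_alt - one_each


def min_coverage(f, power, s=0.5, max_n=100000):
    """Smallest depth n with detection_power(n, f, s) >= power, or None."""
    for n in range(3, max_n + 1):
        if detection_power(n, f, s) >= power:
            return n
    return None


if __name__ == "__main__":
    # Example from the question: 10% frequency, 90% power.
    print(min_coverage(0.10, 0.90))
```

Note that f here is the expected fraction of reads carrying the mutation, so the pooled vs. haploid/diploid questions above matter: for a heterozygous diploid site you would plug in about 0.5, for a pooled sample the pool-level allele frequency.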