I have a statistical problem I need some help with. How can I calculate the probability of finding the 291 bp V4 region of the 16S rRNA gene in any given 1000 bp fragment of a whole-genome DNA extract, if the genome is 3.5 Mbp long?
In other words, how many of the 1000 bp fragments would be expected to contain that region?
I am considering two cases: an average of 2.7 copies of the gene per genome, or 5.4 copies.
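To make the setup concrete, here is a minimal sketch of one way the calculation could be framed. It rests on several assumptions that are not given in the question: every fragment is exactly 1000 bp, fragment start positions are uniformly random along a circular genome, the complete 291 bp region must lie inside the fragment, and the gene copies are far enough apart that no single fragment holds two of them. Under those assumptions a fragment contains a given copy only if it starts within a window of 1000 - 291 + 1 = 710 positions. The function names are hypothetical.

```python
def prob_fragment_contains_region(genome_len, fragment_len, region_len, copies):
    """Approximate probability that one random fragment fully contains the region.

    Assumes uniform random fragment starts on a circular genome and copies
    spaced far enough apart that their start windows do not overlap.
    """
    # Number of fragment start positions that fully cover one copy of the region.
    window = fragment_len - region_len + 1
    if window <= 0:
        return 0.0  # the fragment is shorter than the region
    return min(1.0, copies * window / genome_len)


def expected_fragments_per_genome(genome_len, fragment_len, region_len, copies):
    """Expected number of region-containing fragments if one genome is cut
    into genome_len / fragment_len fragments of equal length."""
    n_fragments = genome_len / fragment_len
    p = prob_fragment_contains_region(genome_len, fragment_len, region_len, copies)
    return n_fragments * p


if __name__ == "__main__":
    GENOME_LEN = 3_500_000   # 3.5 Mbp
    FRAGMENT_LEN = 1_000     # bp
    REGION_LEN = 291         # bp, V4 region
    for copies in (2.7, 5.4):
        p = prob_fragment_contains_region(GENOME_LEN, FRAGMENT_LEN, REGION_LEN, copies)
        n = expected_fragments_per_genome(GENOME_LEN, FRAGMENT_LEN, REGION_LEN, copies)
        print(f"copies={copies}: p={p:.3e}, expected fragments per genome={n:.3f}")
```

With these assumptions the probability comes out to roughly 5.5e-4 for 2.7 copies and 1.1e-3 for 5.4 copies, or about 1.9 and 3.8 region-containing fragments per 3500 fragments. Real shearing yields a distribution of fragment lengths, so this is only a first approximation.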