Choosing the best number of species for assay


2.3 years ago

Nhan • 0

I'm doing amplicon sequencing of a virus across many different regions. Let's say I put 20k unique species into my PCR assay, and after amplification and sequencing about 19k of them appear in the data. Many of these species appear at very low read counts, and I can't tell whether those reads are real or just noise. I use a threshold T: a species whose read count is above T is counted as real. Of the 20k input species, only about 15% are "real" by this criterion. I also have data for several input species counts (2k, 10k, 20k, 40k) and the corresponding fraction of "real" species.
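To make the "real" criterion concrete, here is a minimal sketch of how the fraction could be computed. The function name `fraction_real`, the read counts, and the threshold value are all made up for illustration; the only thing taken from the description above is that a species counts as real when its read count is above T, with the number of input species (including those with zero reads) as the denominator.

```python
import numpy as np

def fraction_real(read_counts, threshold):
    """Fraction of input species whose read count is strictly above the threshold.

    read_counts should have one entry per INPUT species, with 0 for
    species that never appeared in the sequencing data.
    """
    counts = np.asarray(read_counts)
    return float(np.mean(counts > threshold))

# Made-up read counts for 10 input species (illustrative only, not real data)
example_counts = [0, 1, 2, 3, 5, 8, 40, 120, 2, 1]
frac = fraction_real(example_counts, threshold=10)
print(frac)
```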

My question is: how do I determine which input species count is best? I only looked at four values, and the best one may not be among them; I obviously can't try everything between 2k and 40k. I think I want a balance between the largest fraction of "real" species and the total number of species.
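One way to frame that balance, purely as a rough sketch: treat the expected number of "real" species, N × f(N), as the quantity to maximize, and interpolate f between the four tested input counts. This objective is an assumption, not an established method, and the interpolation (linear in log10 of N) is just one simple choice. Only the 0.15 at 20k comes from my data as described above; the other three fractions below are placeholders to be replaced with measured values.

```python
import numpy as np

# Input species counts that were actually tested
n_input = np.array([2_000, 10_000, 20_000, 40_000])

# Fraction of "real" species at each input count.
# Only 0.15 (at 20k) is from the post; the rest are PLACEHOLDERS.
frac_real = np.array([0.40, 0.25, 0.15, 0.05])

def expected_real(n):
    """Estimated number of 'real' species at input count n.

    Interpolates the fraction linearly in log10(n) between tested points,
    then multiplies by n. Only meaningful inside the tested range.
    """
    f = np.interp(np.log10(n), np.log10(n_input), frac_real)
    return n * f

# Evaluate on a grid of candidate input counts within the tested range
grid = np.arange(2_000, 40_001, 500)
yields = expected_real(grid)
best_n = int(grid[np.argmax(yields)])
print(best_n, float(yields.max()))
```

With only four measured points the location of the maximum is quite uncertain, so the result is probably better used to pick the next input count to test than as a final answer.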

Is there any research (or better yet code) that answers/discusses this problem?

Thanks in advance.

ampseq pcr • 532 views
