Question

How To Best Determine Sample Size For Clonal Sequencing?

12.4 years ago

Mark Evans ▴ 50

Hello,

Our lab is going to do Sanger sequencing of clonal viral isolates from some infected patients. The question that has been asked is what is the accepted method for establishing confidence levels around the detection of a 1 percent minor variant by picking X clones for sequencing?

In other words, how many clones do we need to pick per patient so that we can be very confident that we can still detect a minor virus variant, even if it is present at only 1% frequency among the patient's viral load?

I am hoping that this is a pretty common problem and that it has been solved (and published) somewhere, although I can't seem to find any useful references.

References, equations, ideas, comments are welcome.

Thanks.

Mark

sanger sequencing statistics • 2.2k views

updated 12.4 years ago by Swbarnes2 ★ 1.6k • written 12.4 years ago by Mark Evans ▴ 50

score 1 · Answer 1 · 2011-11-17

I think the simplest calculations run as follows:

If 1% of your sample is mutated and you sequence one clone, the chance of missing that mutant is 99%. If you sequence two clones, the chance that both of them miss it is .99*.99 = 98%.

Sequence 50 clones, and the odds of you missing the 1% variant are (.99)^50, or 60%.

Sequence 100, the odds of missing the variant are 36.6%

Sequence 200, the odds of missing the variant are 13.4%

Sequence 300, and the odds are < 5%.

So I'd say that with 300 clones, you have about a 95% chance of picking up at least one clone that carries a variant present at 1%.
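The arithmetic above can be turned around to solve for the number of clones directly. The sketch below is only an illustration, not a published method: it assumes each picked clone is an independent draw that carries the variant with probability equal to its frequency (no cloning or PCR bias), and that "detection" means seeing the variant in at least one clone. The function names `miss_probability` and `clones_needed` are made up for this example.

```python
import math


def miss_probability(n_clones, variant_freq):
    """Chance that none of n_clones carries a variant at variant_freq.

    Assumes independent, unbiased sampling of clones (an assumption of
    this sketch, not something established in the thread).
    """
    return (1.0 - variant_freq) ** n_clones


def clones_needed(variant_freq, detect_prob):
    """Smallest number of clones giving at least detect_prob chance of
    seeing the variant in one or more clones.

    Solves (1 - variant_freq)^n <= 1 - detect_prob for n.
    """
    if not (0.0 < variant_freq < 1.0 and 0.0 < detect_prob < 1.0):
        raise ValueError("variant_freq and detect_prob must be between 0 and 1")
    return math.ceil(math.log(1.0 - detect_prob) / math.log(1.0 - variant_freq))


if __name__ == "__main__":
    # Reproduce the figures quoted in the answer for a 1% variant.
    for n in (50, 100, 200, 300):
        print(n, "clones: miss probability", round(miss_probability(n, 0.01), 3))
    print("clones for 95% detection:", clones_needed(0.01, 0.95))
```

Under these assumptions the exact threshold for 95% detection of a 1% variant comes out just under the round figure of 300. Seeing the variant in a single clone says nothing about sequencing error or how precisely the frequency is estimated, so the real number may need to be higher.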

I don't think Sanger sequencing is the technique of choice here. If you pool ten patients together and get the pool Illumina sequenced, you can spot a 1/10 mutation confidently. You may be able to spot a 1/100 mutation; we've done that here with spiked samples and very high coverage.