Question

Bias During Exome Capture For CNV Analysis



13.4 years ago

Vikas Bansal ★ 2.4k

Hi everyone,

I have some confusion regarding "differences in probe affinities" during exome capture for CNV analysis. I have heard a lot about this bias in exome capture, although I have read some papers that offer CNV analysis for whole-exome data. But what if a person has sequenced only 1000 genes? There is no tool for CNV analysis of this kind of data, and what if the person does not have a control sample? The main question is what kind of affinities we are talking about, because this bias will occur for the whole exome as well as when we capture only some genes. I would like to know your views on the main reason we are not able to analyze this kind of data.

Thanks and Best regards,

Vikas

cnv exome next-gen sequencing • 7.5k views

updated 13.3 years ago by Chris Miller 22k • written 13.4 years ago by Vikas Bansal ★ 2.4k

score 25 · Answer 1 · 2012-02-27

Copy number analysis from NGS is based on the idea of read depth - that is, all other things being equal, a region with a copy number of 4x will have twice as many reads as a region with a copy number of 2x. This works well with whole genome sequencing. Even though there are slight biases, due to things like mappability and GC content, they can be corrected for fairly easily and accurate copy number can be assessed.
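To make the read-depth idea concrete, here is a minimal sketch in Python. It assumes per-window mean depths are already computed, that most of the genome is at 2 copies (so the median depth is a fair diploid baseline), and that mappability and GC effects have already been corrected. The function name and the numbers are made up for illustration and do not come from any particular tool.

```python
from statistics import median

def copy_number_from_depth(depths, baseline_ploidy=2):
    """Estimate copy number per window from read depth alone.

    Assumes the median window is at `baseline_ploidy` copies and that
    depth scales linearly with copy number (no capture bias).
    """
    baseline = median(depths)
    return [baseline_ploidy * d / baseline for d in depths]

# Hypothetical mean depths for six genomic windows (illustrative values only).
depths = [30, 31, 29, 60, 30, 15]
estimates = copy_number_from_depth(depths)

for depth, cn in zip(depths, estimates):
    print(f"depth={depth:>3}  estimated copy number={cn:.2f}")
```

The window with twice the baseline depth comes out at about 4 copies and the one with half the depth at about 1 copy, which is the proportionality described above.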

Exome sequencing, or targeted capture, is a whole different story. Factors like the GC content of the probes, the concentration of DNA, and even the temperature of hybridization, will make the number of reads captured by each probe different. These differences are often dramatic. This means that even for two regions that are both at a copy number of 2x, the number of sequencing reads can be off by orders of magnitude.

This data is essentially useless for copy number calling using standard methods. There is, however, one exception. Some smart people have figured out that if a tumor sample and matched normal are prepared at the same time, under the same conditions (same tech, same reagent batch, etc), then the biases will be roughly the same, and you can use the ratio between the two as a reasonable proxy for copy number. At the moment, I know of no other way to get accurate calls from capture-based sequencing.
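The tumor/normal ratio trick can be sketched roughly as follows. This is only an illustration, not the method of any specific published tool. It assumes per-target read counts for both samples from the same capture run, a diploid normal, and that most targets are copy-neutral (so the median log2 ratio can be used to cancel the difference in sequencing depth). The function name, the pseudocount, and the counts are all hypothetical.

```python
import math
from statistics import median

def centered_log2_ratios(tumor_counts, normal_counts, pseudocount=0.5):
    """Per-target log2(tumor/normal), median-centered.

    Dividing by the matched normal cancels per-probe capture efficiency;
    subtracting the median cancels the overall depth difference.
    The pseudocount only guards against zero counts.
    """
    raw = [
        math.log2((t + pseudocount) / (n + pseudocount))
        for t, n in zip(tumor_counts, normal_counts)
    ]
    center = median(raw)
    return [r - center for r in raw]

# Hypothetical counts for five targets. Capture efficiency varies widely
# between probes, but the normal sample is at 2 copies everywhere.
normal = [1000, 10, 200, 5000, 50]
# Tumor sequenced at twice the depth, with the third target duplicated (4 copies).
tumor = [2000, 20, 800, 10000, 100]

ratios = centered_log2_ratios(tumor, normal)
for n, t, r in zip(normal, tumor, ratios):
    # Copy number estimate assumes a diploid normal and a pure tumor sample.
    print(f"normal={n:>5} tumor={t:>6} log2 ratio={r:+.2f} copy number~{2 * 2 ** r:.1f}")
```

Raw counts in the normal differ by more than two orders of magnitude between targets at the same copy number, yet the centered ratios sit near 0 everywhere except the duplicated target, which lands near +1. Without the matched normal there is nothing to divide out the probe effect with, which is why a single sample on its own is so hard to interpret.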

Undoubtedly, people are working on the problem, and I don't claim that it's intractable - just quite difficult. In the meantime, there isn't a good way to get the information you're looking for from the data you have. Perhaps you should consider an alternate assay. Running a 500k SNP chip will give you fairly good resolution at a reasonably low price.