Tool: Revolution (?) in CNA detection using exome/targeted sequencing
5.8 years ago by Irsan:

A new paper was recently published describing CopywriteR, a tool for copy number alteration (CNA) profiling from targeted sequencing data (e.g. whole-exome sequencing, WES). The "revolution" is that CNA detection is based on off-target reads instead of on-target reads. This sidesteps the problem that exome baits vary widely in capture efficiency, both from bait to bait and between samples, studies and batches. As a result, the signal-to-noise ratio improves drastically and approaches that of whole-genome sequencing. Additional benefits are that no reference sample is required and that CNAs outside the targeted regions (for WES, the non-exonic genome) can be quantified.

The authors divided the tool into three stages:

  1. preprocessing
  2. identification of off-target reads and log2(copy number) (LRR) quantification
  3. segmentation and plotting

This modular design lets you, for example, use CopywriteR for preprocessing and LRR quantification while using your own preferred tools for segmentation and visualization.
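To make stage 2 concrete, here is a minimal Python sketch of the off-target idea: discard reads that fall inside captured regions, count the rest in fixed-size bins, and express each bin as a log2 ratio against the median bin. This is a toy illustration, not CopywriteR's implementation. The function name `offtarget_log2_ratios`, its parameters, and the median normalization are assumptions made for the example, and any further corrections a real tool applies are left out.

```python
import math
from statistics import median

def offtarget_log2_ratios(read_starts, targets, bin_size, region_length):
    """Toy log2 ratio per bin, using only reads outside the target intervals.

    read_starts   -- read start positions on one chromosome (0-based)
    targets       -- list of (start, end) captured intervals, half-open
    bin_size      -- bin width in base pairs
    region_length -- length of the region being binned
    """
    n_bins = math.ceil(region_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        # Skip reads that start inside any captured (on-target) interval.
        if any(start <= pos < end for start, end in targets):
            continue
        counts[pos // bin_size] += 1
    # Normalize against the median non-empty bin so a typical bin sits at 0.
    baseline = median(c for c in counts if c > 0)
    # Empty bins have no defined log2 ratio, so they are reported as None.
    return [math.log2(c / baseline) if c > 0 else None for c in counts]

reads = [10, 20, 30, 40, 110, 120, 150, 155, 210, 220, 310, 320]
ratios = offtarget_log2_ratios(reads, targets=[(150, 160)], bin_size=100, region_length=400)
print(ratios)
```

In this toy input the two reads at 150 and 155 are on-target and are ignored; the first bin holds twice as many off-target reads as the others, so it comes out one log2 unit above the rest.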

The only drawback I can think of is that with very good capture efficiency the amount of off-target data drops, and the signal-to-noise ratio drops with it. In those cases I think you could increase the bin size (trading away resolution) to still get good-quality CNA profiles.
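A rough way to see the bin-size trade-off, assuming read counts per bin behave approximately like Poisson counts (my assumption, not something stated in the paper): the relative noise of a bin scales as 1/sqrt(mean count), so merging four adjacent bins halves the noise at the cost of four times coarser resolution. The helper names below are hypothetical.

```python
import math

def merge_bins(counts, factor):
    """Sum each run of `factor` adjacent bins into one larger bin."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

def poisson_relative_noise(mean_count):
    """Relative standard deviation of a Poisson count with the given mean."""
    return 1 / math.sqrt(mean_count)

small_bins = [4, 6, 5, 5, 3, 7, 6, 4]      # sparse off-target counts per bin
large_bins = merge_bins(small_bins, 4)     # bins that are four times wider

for label, bins in (("small", small_bins), ("large", large_bins)):
    mean_count = sum(bins) / len(bins)
    print(label, len(bins), "bins, relative noise ~", poisson_relative_noise(mean_count))
```

With fewer off-target reads per bin, you pick the smallest bin size whose expected count still gives an acceptable noise level.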

If you want to try it, make sure you have R version >3.2 and Bioconductor version >3.1.

Disclaimer: I was not involved in the design/development of this tool, nor involved in publication, finances or anything else. I just think that the off-target strategy is the best solution available for whole-exome sequencing CNA profiling.

Modified 5.3 years ago by Eric T.; written 5.8 years ago by Irsan

Hi Irsan, just a small modification: it is possible to run CopywriteR under R 3.1 and Bioconductor 3.0 when using the version on GitHub; see for installation instructions. Best, Thomas

Reply written 5.8 years ago by thomaskuilman

5.3 years ago by Eric T. (San Francisco, CA):

CNVkit is another tool that uses off-target reads similarly, but in addition to the on-target reads. (I'm the author.) Performance is indeed much better when off-target reads are used for calling large-scale CNVs.


How do you define large-scale CNVs? Say, larger than 20 kbp? Indeed, I can imagine that for single-exon CNV/CNA/SCNA calling the on-target data is important.

Reply written 5.1 years ago by Irsan

Large-scale could be >400kb, something that would show up on lower-resolution array CGH. At that scale, CNVkit and CopywriteR make accurate calls covering intergenic regions, but sequencing-based CNV callers that use only on-target reads can only guess from the coverage of nearby genes.

But the off-target regions also help support calls in genic regions by adding datapoints both within and just outside the CNV that can be used for segmentation, so CNVkit and even CopywriteR can perform well on focal CNVs, too. Calling CNVs smaller than a gene is a matter of using smaller bins for counting reads, and sequencing deeper to gather enough evidence to support the call.

In general it's a good idea to use multiple callers to detect structural variants at every scale, e.g. including LUMPY or DELLY and Pindel in the pipeline to find SVs that aren't well supported by read depth alone. Then you can use a program like MetaSV to merge the calls into a single VCF, to keep it clean.
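As a small illustration of what merging calls from several callers involves, one common criterion for deciding that two calls describe the same event is reciprocal overlap. The sketch below is only that criterion in isolation, with hypothetical function names and an arbitrary 50% threshold; it is not a description of how MetaSV works. Intervals are assumed to be half-open `(start, end)` pairs on the same chromosome.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two fractions of a and b that their overlap covers."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))

def same_event(a, b, threshold=0.5):
    """Treat two calls as one event if they reciprocally overlap enough."""
    return reciprocal_overlap(a, b) >= threshold

depth_call = (1000, 2000)   # e.g. a deletion called from read depth
split_call = (1200, 2200)   # e.g. the same region from a split-read caller
print(reciprocal_overlap(depth_call, split_call), same_event(depth_call, split_call))
```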

Reply written 5.1 years ago by Eric T.