DNAcopy for targeted sequencing
6.6 years ago
bdelolmo ▴ 10

Hello,

I am trying DNAcopy on a gene panel. As expected, it merges consecutive exons that have similar log ratios. The problem arises when a gene shows signs of a CNV. In that case DNAcopy finds the start of the CNV, but it also reports the first coordinate of the next gene as abnormal (when it should not).

For example:

I have a gene that spans chr6:7542142-7586118 (24 exons). I found 16 consecutive exons with a deletion, from 7567581 to 7586118 (exons 9 to 24).

The DNAcopy output is like:

chr6 7542142 7566604 8 1.0275

chr6 7567581 118879078 17 0.5682

The second segment includes the first exon of the next gene (chr6:118879078), which has a normal copy number ratio. Does anyone know how I can avoid this?

Here is the code that I use:

"library(DNAcopy)";
"cn <- read.table(\"$ratiosRD\", header=F)";
"CNA.object <- CNA(genomdat = cn[,$i], chrom = cn[,1], maploc = cn[,2], data.type = 'logratio')\n";
"CNA.smoothed <- smooth.CNA(CNA.object)";
"segs <- segment(CNA.object, verbose=0, min.width=2)";
"segs2 = segs\$output";
"write.table(segs2[,2:6], file=\"$segmented.$SAMPLES[$j].bed\", row.names=F, col.names=F, quote=F, sep=\"\t\")";

DNAcopy CNV exome gene panel • 2.7k views
6.6 years ago
Eric T. ★ 2.7k

Three approaches:

• CNVkit uses off-target reads between genes to detect CNV breakpoints, if your targeted sequencing protocol uses hybrid capture. In most of your cases this should help place the breakpoint correctly in the intergenic region instead of at the edge of the next targeted exon. Otherwise the CNVkit pipeline is conceptually similar to the workflow you're probably using.
• If your protocol is targeted amplicon sequencing instead, and there are no off-target sequencing reads to improve your estimates, then try OncoCNV. After running CBS via DNAcopy, it uses another statistical test at the gene level to remove spurious breakpoints. I think this might trim the CNV to the gene of interest in your problematic cases.
• The R package PSCBS, which wraps DNAcopy, has a procedure for estimating confidence intervals around breakpoints, which would help you detect breakpoints with questionable positioning near gene boundaries (if you don't mind some additional programming).
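If you just want a quick sanity check before switching tools, a crude stand-in for the gene-level step is to cut every segment at gene boundaries and recompute the summary per gene. The sketch below is base R only and is not what OncoCNV or PSCBS actually do. The function name `split_segments_by_gene`, the column names, the gene labels, the ratios and the interior exon positions are all made up for illustration; only the segment end coordinates come from the question.

```r
# Hypothetical helper (not part of DNAcopy, OncoCNV or PSCBS):
# cut each segment at gene boundaries and summarise each piece.
# `exons` needs columns: chrom, pos, gene, ratio, seg_id, where seg_id
# is the index of the segment that the exon was assigned to.
split_segments_by_gene <- function(exons) {
  pieces <- split(exons, list(exons$seg_id, exons$gene), drop = TRUE)
  out <- do.call(rbind, lapply(pieces, function(p) {
    data.frame(chrom = p$chrom[1], gene = p$gene[1],
               loc.start = min(p$pos), loc.end = max(p$pos),
               num.mark = nrow(p), seg.mean = mean(p$ratio),
               stringsAsFactors = FALSE)
  }))
  out <- out[order(out$chrom, out$loc.start), ]
  rownames(out) <- NULL
  out
}

# Toy data shaped like the example in the question: 24 exons in one gene
# (8 normal, 16 deleted) plus the first exon of the next gene.
# Interior positions and ratios are invented.
exons <- data.frame(
  chrom  = "chr6",
  pos    = c(round(seq(7542142, 7566604, length.out = 8)),
             round(seq(7567581, 7586118, length.out = 16)),
             118879078),
  gene   = c(rep("GENE_A", 24), "GENE_B"),
  ratio  = c(rep(1.0, 8), rep(0.55, 16), 1.0),
  seg_id = c(rep(1, 8), rep(2, 17)),  # segment 2 swallowed the GENE_B exon
  stringsAsFactors = FALSE
)

res <- split_segments_by_gene(exons)
print(res)
```

With real data you would first assign `seg_id` by matching each exon position to the `loc.start`/`loc.end` intervals of the DNAcopy output. A gene piece with only one or two markers and a mean close to normal, like the `GENE_B` row here, is a good candidate for a spurious extension.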