Question

How to assess confidence of cnvkit calls

7.7 years ago

phusion • 0

Hi,

I'm new to CNV calling and to CNVkit. I was wondering: 1) is there a way to assess the confidence of a copy number call from the CNVkit output (any metrics besides CI and PI)? 2) What factors (sequencing coverage, etc.) cause a wrong copy number call, and how can I see this in the CNVkit output?

I'm asking because I've been comparing ERBB2 CNV calls from CNVkit (run through the bcbio pipeline) with the clinical ERBB2 status (the "truth set"), and a few samples don't tally. For example, the log2 value, CI and PI are all above 0 for some samples that are supposed to be ERBB2 negative.

Any ideas pls? Thanks!

Edit: In case it's useful, the samples are FFPE, sequenced with a capture protocol targeting several genes, and the log2 values fall outside +/- 0.2, so I can't call these ERBB2-negative samples copy-neutral.

Edit 2: I can provide scatter plots from the CNVkit output if necessary. For at least one sample the signal looks real, with all points from the ERBB2 segment clustering nicely above log2=0. However, it's highly unlikely to be biological, because the metastasis from the same sample loses the signal (cn=2). I was wondering whether the nature of FFPE could be producing spurious gains/losses?

cnv cnvkit • 2.9k views

score 3 · Answer 1 · 2017-10-25

7.7 years ago

Kevin Blighe 89k

Hi phusion,

Detecting ERBB2/HER2 amplification (in circulating free DNA) was actually the subject of my very first publication (https://www.ncbi.nlm.nih.gov/pubmed/21427727). I have also worked with FFPE.

The FFPE process affects samples in ways that are, for now, unpredictable, which would explain why you see signal in some samples but not others. I would normally expect sections of DNA that have become cross-linked during FFPE to remain inaccessible and thus show up as copy number losses in your assays. If CNVkit detects a gain over these regions, it may be due to gross non-uniformity of coverage that makes it appear as if there is a copy number gain. You will have to check the output across ERBB2 and its flanking regions, which can also be amplified in HER2-positive breast cancer.
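One way to do that check by hand is sketched below. It assumes the bin-level .cnr table has been read into dictionaries with the columns chromosome, start, end, gene and log2; the function names and the `flank` window size are made up for illustration and are not part of CNVkit. It compares the median log2 of the ERBB2 bins against the bins flanking the gene on the same chromosome, and reports the median absolute deviation (MAD) as a rough measure of noise.

```python
import csv
from statistics import median


def read_cnr(path):
    """Read a tab-separated .cnr file into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def summarize(rows):
    """Return bin count, median log2 and MAD for a list of bins (None if empty)."""
    values = [float(r["log2"]) for r in rows]
    if not values:
        return None
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return {"n": len(values), "median": med, "mad": mad}


def gene_vs_flank(bins, gene, flank=5_000_000):
    """Summarize a gene's bins and the bins within `flank` bp on either side.

    `flank` is an arbitrary illustrative window, not a recommended value.
    """
    bins = list(bins)
    # A bin can carry several comma-separated gene labels.
    target = [b for b in bins if gene in b["gene"].split(",")]
    if not target:
        raise ValueError(f"no bins labelled {gene}")
    chrom = target[0]["chromosome"]
    lo = min(int(b["start"]) for b in target)
    hi = max(int(b["end"]) for b in target)
    target_ids = {id(b) for b in target}
    flanking = [
        b for b in bins
        if id(b) not in target_ids
        and b["chromosome"] == chrom
        and int(b["end"]) > lo - flank
        and int(b["start"]) < hi + flank
    ]
    return summarize(target), summarize(flanking)
```

Usage would look like `gene_vs_flank(read_cnr("sample.cnr"), "ERBB2")`, where the file name is a placeholder. A gene median well above the flank median with a small MAD on both sides points to a focal signal, whereas a noisy or sparsely covered flank means the comparison says little either way.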

Regarding CNVkit output specifically, I direct you to this thread: Help with understanding CNVkit output

Yes, I've noticed that these false positives generally have lower coverage (about 10-20X, compared with the true positives, which usually have >100X). But coverage looks relatively even to me, except for a high peak right where ERBB2 is. The resolution of the flanking regions is poor, probably because we use a small targeted panel of fewer than 100 genes. I've also noticed that these samples have extremely high duplication rates (>70%), which results in low coverage after PCR duplicate removal in CNVkit. I guess you are right that large portions of the DNA have been degraded. But I would have thought that degradation occurs evenly across the genome, and it seems odd that only ERBB2 is "spared"...
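To put numbers on the depth observation, here is a minimal sketch that assumes the .cnr bins are available as dictionaries with gene, log2 and depth columns. It computes a percentile bootstrap interval for the median log2 of a gene's bins and flags low median depth. This is a plain bootstrap written for illustration, not CNVkit's own CI calculation, and the `min_depth` cutoff of 30 is an arbitrary assumed threshold.

```python
import random
from statistics import median


def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the median of `values`."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two bins")
    rng = random.Random(seed)  # fixed seed so results are reproducible
    stats = sorted(
        median(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


def gene_call_support(bins, gene, min_depth=30.0):
    """Summarize how well the bins of one gene support a non-neutral call.

    `min_depth` is a hypothetical cutoff chosen for illustration only.
    """
    rows = [b for b in bins if gene in b["gene"].split(",")]
    if not rows:
        raise ValueError(f"no bins labelled {gene}")
    log2 = [float(b["log2"]) for b in rows]
    depth = median([float(b["depth"]) for b in rows])
    lo, hi = bootstrap_ci(log2)
    return {
        "n_bins": len(rows),
        "median_log2": median(log2),
        "ci": (lo, hi),
        "ci_excludes_zero": lo > 0 or hi < 0,
        "median_depth": depth,
        "low_depth": depth < min_depth,
    }
```

An interval that excludes zero only says the bins agree with each other. If `low_depth` is also true, as with the 10-20X samples here, a systematic bias shared by all ERBB2 bins could still produce a tight interval around the wrong value.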

Reply written 7.7 years ago by phusion • 0

Yes, I'm not so sure on that final point about degradation showing an even profile. In fact, work by another colleague of mine shows that it can be very uneven and that it's based on various factors, not just GC content.

Did you whole genome amplify the samples prior to using them (or just the target regions)? That can also result in uneven coverage, when used with FFPE.

From what I understand, though, even if NGS may not be ideal for this purpose with FFPE samples, a simple qPCR can work quite well.

Reply written 7.7 years ago by Kevin Blighe 89k

Yes, we did whole-genome amplify it, indirectly: during library prep, the fragmented, pooled and barcoded library was PCR-amplified before the target hybridization step. Thanks for the qPCR suggestion - we may try that out.

Reply written 7.7 years ago by phusion • 0