I'm new to CNV calling and the CNVkit software. I was wondering: 1) is there a way to assess the confidence of a copy number call from the CNVkit output (any metrics besides the confidence interval (CI) and prediction interval (PI))? 2) What factors (sequencing coverage, etc.) cause a wrong copy number call, and how can I spot this in the CNVkit output?
I'm asking because I've been comparing ERBB2 CNV calls from CNVkit (run through the bcbio pipeline) with the clinical ERBB2 status (the "truth set"), and a few samples don't tally. For example, the log2 value is above 0, and the CI and PI lie entirely above 0, for samples that are supposed to be ERBB2 negative.
Any ideas, please? Thanks!
Edit: In case it's useful, the samples are FFPE and were sequenced with a capture protocol targeting several genes. The log2 values for these samples fall outside the +/- 0.2 range, so I can't call these ERBB2-negative samples copy-neutral.
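To make the check concrete, here is a minimal sketch of the rule I'm applying, not anything CNVkit does itself. The `Segment` fields, the numbers, and combining the 0.2 cutoff with "CI spans 0" are my own assumptions for illustration, and the copy number conversion assumes a pure, diploid sample.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one segment's values (not a CNVkit class)."""
    gene: str
    log2: float
    ci_lo: float  # lower bound of the confidence interval
    ci_hi: float  # upper bound of the confidence interval


def is_copy_neutral(seg: Segment, threshold: float = 0.2) -> bool:
    """Neutral only if |log2| is within the threshold and the CI spans 0."""
    return abs(seg.log2) <= threshold and seg.ci_lo <= 0.0 <= seg.ci_hi


def naive_copy_number(log2: float, ploidy: int = 2) -> float:
    """Copy number implied by a log2 ratio, assuming 100% tumor purity."""
    return ploidy * 2 ** log2


# Illustrative values only, not taken from my samples.
discordant = Segment("ERBB2", log2=0.35, ci_lo=0.28, ci_hi=0.41)
concordant = Segment("ERBB2", log2=0.05, ci_lo=-0.03, ci_hi=0.12)

print(is_copy_neutral(discordant))  # fails both the cutoff and the CI test
print(is_copy_neutral(concordant))  # passes both
```

With this rule, a segment like `discordant` can't be called neutral even though the clinical status says ERBB2 negative, which is the situation I'm stuck on.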
Edit2: I can provide scatter plots from the CNVkit output if necessary. For at least one sample the signal looks real, with all points in the ERBB2 segment clustering nicely above log2=0. However, it's highly unlikely to be biological, because the metastasis from the same sample loses the signal (cn=2). Could this be due to the nature of FFPE material, resulting in spurious gains/losses?