CNV not called
6.2 years ago
vps767 ▴ 20

I am attempting to use CNVkit and have successfully run it on a test exome sequencing sample (non-cancer blood) against a panel of other samples (non-cancer blood). The test sample contains a very large homozygous deletion that should be trivial to detect. However, the deletion is not called using cbs (with default parameters or a threshold of 0.2) or flasso, and I get a warning like:

DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False. data = self._reader.read(nrows)

HaarSeg detects the deletion, but it calls 1255 segments, and the gene names are missing from its output.

The .cnr file clearly shows the deletion, so the early processing steps are working. Running CBS manually in R (DNAcopy) easily detects the deletion with default parameters, with or without weights; see below.

 

cnvkit.py batch ../shortcuts/C1-25.bam -r /mnt/capture/cnvkit/Sept30_2015_reference.cnn  --output-dir results/

cnvkit.py segment C1-25.cnr -m cbs


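library("DNAcopy")
# Load the bin-level log2 ratios written by CNVkit
datatab <- read.table("C1-25.cnr", header=TRUE, comment.char="")
# Build a copy number object from the log2 ratios and bin coordinates
CNA.object <- CNA(cbind(datatab$log2),datatab$chromosome,datatab$start,data.type="logratio")
# Run CBS with default parameters, weighting each bin by CNVkit's weight column
segment.CNA.object <- segment(CNA.object, verbose=1, weights=datatab$weight)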

 

Any help would be greatly appreciated.

 

Vince

 

exome copy number cnvkit • 2.5k views

Would you mind tagging this question with "cnvkit" so it's easier to find? I missed it earlier, sorry.

6.2 years ago
Eric T. ★ 2.7k

Thanks for the bug report. Both issues should now be resolved in the code on GitHub, and the fixes will be in the next CNVkit release:

  • There was a filter in place to remove very-low-coverage probes before segmentation; this makes sense for contaminated tumor samples but not for germline samples. I've made the filter optional.
  • For the HaarSeg issue, the gene names should show up now like they did for CBS and Fused Lasso.
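
To see why such a filter can hide a homozygous deletion, it may help to count how many bins in the .cnr file fall below a log2 cutoff. The following is only a rough sketch: the function name count_low_bins is made up for illustration, the -5 cutoff is an arbitrary example rather than CNVkit's actual setting, and it assumes the .cnr file is tab-separated with a header row containing "chromosome" and "log2" columns.

```shell
# Count bins per chromosome whose log2 ratio is below a cutoff.
# Assumptions (not taken from CNVkit itself): tab-separated input with a
# header row naming "chromosome" and "log2"; the default cutoff of -5 is
# illustrative only.
count_low_bins() {
    cnr_file="$1"
    cutoff="${2:--5}"
    awk -F'\t' -v cutoff="$cutoff" '
        NR == 1 {
            # Map column names to positions so column order does not matter
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Force numeric comparison of the log2 value against the cutoff
        ($col["log2"] + 0) < (cutoff + 0) { low[$col["chromosome"]]++ }
        END { for (chrom in low) print chrom "\t" low[chrom] }
    ' "$cnr_file" | sort
}
```

Calling `count_low_bins C1-25.cnr` prints one line per chromosome that has bins under the cutoff. If nearly all of those bins sit inside the expected deletion, dropping them before segmentation would leave the segmenter with little or no evidence of the event.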

Alternative workarounds (for posterity):

  • In the absence of a .cns file for your sample, you can use CNVkit's gainloss command to identify the genes likely affected by CNVs.
  • If you've successfully segmented the .cnr file in raw R, the printed output data frame is probably in SEG format or close to it, in which case you can import it back into CNVkit's .cns format with the import-seg command.
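
The two workarounds above might look roughly like this on the command line. This is a sketch under assumptions: the wrapper function run_workarounds and the file names are hypothetical, the SEG file is assumed to have been exported from R beforehand, and the exact options accepted by gainloss and import-seg may differ between CNVkit versions, so check `cnvkit.py gainloss -h` and `cnvkit.py import-seg -h` first.

```shell
# Sketch of both workarounds; paths and the function name are placeholders.
run_workarounds() {
    cnr_file="$1"            # bin-level log2 ratios from CNVkit
    seg_file="$2"            # SEG-format segments exported from R (hypothetical)
    out_dir="${3:-results}"  # where to write the outputs

    if [ ! -f "$cnr_file" ]; then
        echo "missing .cnr file: $cnr_file" >&2
        return 1
    fi
    mkdir -p "$out_dir"

    # Gene-level gains and losses directly from the .cnr, no .cns required;
    # the table is captured from standard output.
    cnvkit.py gainloss "$cnr_file" > "$out_dir/gainloss.tsv"

    # Convert externally produced segments back to CNVkit's .cns format,
    # only if a SEG file was actually provided.
    if [ -f "$seg_file" ]; then
        cnvkit.py import-seg "$seg_file" -d "$out_dir"
    fi
}
```

For example, `run_workarounds C1-25.cnr C1-25.seg results` would write the gene table and the imported segments under results/.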