Question: CNV not called
vps767 (United States) wrote, 3.6 years ago:

I am trying out CNVkit and have successfully run it on a test exome sequencing sample (non-cancer blood) against a panel of other non-cancer blood samples. The test sample contains a very large homozygous deletion that should be trivial to detect. However, the deletion is not called with cbs (either default parameters or a threshold of 0.2) or with flasso, and the run prints a warning like:

DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False. data = self._reader.read(nrows)

HaarSeg detects the deletion, but it calls 1255 segments and the output has lost the gene names.

The .cnr file clearly shows the deletion, so the early processing steps are fine. Running CBS manually in R with DNAcopy easily detects the deletion using default parameters, with or without weights; see below.

 

cnvkit.py batch ../shortcuts/C1-25.bam -r /mnt/capture/cnvkit/Sept30_2015_reference.cnn  --output-dir results/

cnvkit.py segment C1-25.cnr -m cbs


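library("DNAcopy")
# Read the CNVkit .cnr table; comment.char="" turns off comment handling so no line is cut short
datatab <- read.table("C1-25.cnr", header=TRUE, comment.char="")
# Build a copy-number object from the log2 ratios and bin positions
CNA.object <- CNA(cbind(datatab$log2), datatab$chromosome, datatab$start, data.type="logratio")
# Segment with CBS, weighting each bin by its CNVkit weight (drop 'weights' for the unweighted run)
segment.CNA.object <- segment(CNA.object, verbose=1, weights=datatab$weight)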

 

Any help would be greatly appreciated.

 

Vince

 

exome copy number cnvkit

Would you mind tagging this question with "cnvkit" so it's easier to find? I missed it earlier, sorry.

written 3.6 years ago by Eric T.
Eric T. (San Francisco, CA) wrote, 3.6 years ago:

Thanks for the bug report. Both issues should now be resolved in the code on GitHub, and the fixes will be in the next CNVkit release:

  • There was a filter in place to remove very-low-coverage probes before segmentation; this makes sense for contaminated tumor samples but not for germline samples. I've made the filter optional.
  • For the HaarSeg issue, the gene names should show up now like they did for CBS and Fused Lasso.
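
To see why a low-coverage filter would hide this particular deletion, here is a minimal Python sketch. The cutoff value, the function name and the toy bins are illustrative assumptions, not CNVkit's actual implementation. Bins inside a homozygous deletion have almost no reads, so their log2 ratios are strongly negative, and a filter that drops such bins before segmentation removes exactly the evidence for the deletion.

```python
# Hypothetical cutoff; CNVkit's real threshold is not stated in this thread.
MIN_LOG2 = -5.0


def drop_low_coverage(bins, min_log2=MIN_LOG2):
    """Return only the bins whose log2 ratio is at or above min_log2."""
    return [b for b in bins if b["log2"] >= min_log2]


# Toy bins with made-up values; the middle three mimic a homozygous deletion.
toy_bins = [
    {"chromosome": "chr1", "start": 1000, "log2": 0.05},
    {"chromosome": "chr1", "start": 2000, "log2": -0.10},
    {"chromosome": "chr1", "start": 3000, "log2": -8.2},
    {"chromosome": "chr1", "start": 4000, "log2": -9.1},
    {"chromosome": "chr1", "start": 5000, "log2": -7.7},
    {"chromosome": "chr1", "start": 6000, "log2": 0.02},
]

kept = drop_low_coverage(toy_bins)
# Bins that still look deleted after filtering (log2 well below zero).
deletion_bins_kept = [b for b in kept if b["log2"] < -1]

print(len(toy_bins), "bins in,", len(kept), "bins passed to segmentation")
print("deletion bins remaining:", len(deletion_bins_kept))
```

With the filter applied, the segmenter only ever sees the near-zero bins on either side of the deletion, which is consistent with the deletion being visible in the .cnr file but absent from the calls.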

Alternative workarounds (for posterity):

  • In the absence of a .cns file for your sample, you can use CNVkit's gainloss command to identify the genes likely affected by CNVs.
  • If you've successfully segmented the .cnr file in R, the output data frame is probably in SEG format or close to it, so you can write it out and import it back into CNVkit's .cns format with the import-seg command.
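
As a rough illustration of the second workaround, the Python sketch below rewrites a DNAcopy segment table, as it looks when printed from R, into tab-delimited SEG text. It assumes the printed table has R row names in the first column followed by DNAcopy's usual ID, chrom, loc.start, loc.end, num.mark and seg.mean columns. The function name and the sample values are made up for the example.

```python
SEG_COLUMNS = ["ID", "chrom", "loc.start", "loc.end", "num.mark", "seg.mean"]


def printed_dnacopy_to_seg(text):
    """Convert a printed DNAcopy output data frame to tab-delimited SEG text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    if header != SEG_COLUMNS:
        raise ValueError("unexpected columns: %r" % header)
    out = ["\t".join(SEG_COLUMNS)]
    for ln in lines[1:]:
        fields = ln.split()
        # A printed R data frame carries a row name in front of the six values.
        if len(fields) != len(SEG_COLUMNS) + 1:
            raise ValueError("unexpected row: %r" % ln)
        out.append("\t".join(fields[1:]))
    return "\n".join(out) + "\n"


# Made-up segments for illustration only; the middle one mimics the deletion.
printed = """
        ID chrom loc.start  loc.end num.mark seg.mean
1 Sample.1  chr1     10000  5000000      400   0.0100
2 Sample.1  chr1   5000100  9000000      350  -8.5000
3 Sample.1  chr1   9000100 20000000      900   0.0200
"""

seg_text = printed_dnacopy_to_seg(printed)
print(seg_text, end="")
```

The resulting text could be saved to a .seg file and passed to the import-seg command mentioned above. Writing the table directly from R with tab separators and no row names would avoid the conversion step altogether.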