Question: tumor/normal WGS with CNVkit has many small segments with small copy number changes
9 months ago, swheelan wrote:

Hi! We've used CNVkit fairly extensively and are quite happy with it, thanks. We now have rat tumor/normal WGS samples, and CNVkit generates a great many small (~20 kb) segments that oscillate between a log2 ratio of 0 and about -0.5. The confidence intervals do not overlap each other, and for the segments around -0.5 the CIs do not overlap zero. Coverage and alignments were good, and nothing else seems odd about these samples, but this output is pretty strange. Any suggestions? Thanks!

wgs cnv cnvkit

Could you post the figure generated by CNVkit?

written 9 months ago by chen

I haven't tried posting pictures before, so apologies if this didn't work, but here's a shot of the scatter plot. [Image: scatterplot WGS tumor/normal]

written 9 months ago by swheelan

Are the tumor and normal samples from the same rat, or just from the same strain? There can sometimes be more heterogeneity than you expect. I'm still not sure whether that would explain the result, though.

written 9 months ago by Chris Miller

They're from the same rat.

written 9 months ago by swheelan

Interestingly, we just ran Control-FREEC on the same data and got a small handful of discrete called CNVs, everything else normal ploidy.

written 9 months ago by swheelan
9 months ago, Eric T. (San Francisco, CA) wrote:

How did you run CNVkit? It looks as if it was run with two separate coverage profiles (genic and intergenic?) that were not normalized to each other when combining, so half the bins have log2 values shifted downward by 0.5. This could have happened if you ran batch with WGS data but did not use -m wgs, for example.
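
One rough way to check this hypothesis is to compare the median log2 of bins that carry a gene label against bins that do not, using the .cnr file. The sketch below is only an illustration: it assumes the usual tab-separated .cnr layout with "gene" and "log2" columns, that unannotated bins are labeled "-", and that off-target bins are labeled "Antitarget" (or "Background" in older releases). The function name and the demo rows are made up.

```python
import csv
import io
import statistics

# Labels assumed to mark bins that do not overlap an annotated gene.
# "-" is the placeholder for unannotated bins; "Antitarget"/"Background"
# mark off-target bins in hybrid-capture runs. Check your own files.
NON_GENIC_LABELS = {"-", "Antitarget", "Background"}


def median_log2_by_bin_class(cnr_handle):
    """Return the median log2 of genic and intergenic bins in a .cnr table.

    cnr_handle is any open text handle over a tab-separated table that has
    at least the columns "gene" and "log2". A class with no bins maps to None.
    """
    groups = {"genic": [], "intergenic": []}
    for row in csv.DictReader(cnr_handle, delimiter="\t"):
        key = "intergenic" if row["gene"] in NON_GENIC_LABELS else "genic"
        groups[key].append(float(row["log2"]))
    return {
        key: (statistics.median(values) if values else None)
        for key, values in groups.items()
    }


if __name__ == "__main__":
    # Tiny made-up table standing in for a real .cnr file.
    demo = (
        "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
        "chr1\t0\t1000\tGeneA\t0.0\t30\t1\n"
        "chr1\t1000\t2000\t-\t-0.5\t21\t1\n"
        "chr1\t2000\t3000\tGeneB\t0.1\t32\t1\n"
        "chr1\t3000\t4000\t-\t-0.6\t20\t1\n"
    )
    medians = median_log2_by_bin_class(io.StringIO(demo))
    print(medians)
```

If the two medians differ by roughly 0.5 while each class looks flat on its own, that would be consistent with two coverage profiles that were never normalized to each other. If they agree, the shift probably has another cause.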

Hmm, that would be interesting indeed. Here's the command (paths and names shortened):

cnvkit.py batch tumor.sorted_RG_noDup.bam --normal normal.sorted_RG_noDup.bam -m wgs --fasta /path/rat/rn6/rn6.fa --annotate /path/rat/rn6/rn6_UCSC_gene_merged.bed --output-reference tn.reference.cnn --output-dir ./tvsn

written 9 months ago by swheelan

Have you used normal.sorted_RG_noDup.bam as a reference elsewhere? If there is something odd about the coverage in that normal sample in particular, it could shift the normalized log2 ratios for tumor.sorted_RG_noDup.bam. Contamination of either sample with exome-enriched material, for example, would yield similarly odd results. I recommend using a pooled reference of multiple normal samples to reduce the risk of this happening and to reduce noise in the results generally.

You can also try using flasso as the segmentation method (with the segment command) instead of the default cbs, as fused lasso seems to work better on large datasets like WGS.
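
Both suggestions can be written out as command lines. The sketch below only assembles the argument lists in Python and prints them; it does not run CNVkit. The file names (normal1.bam, tumor.cnr and so on) are placeholders rather than files from this thread, and the helper names are made up. The options are the ones in the batch command posted earlier, plus -m flasso and -o for the segment command. As far as I know, flasso also needs the R package cghFLasso to be installed.

```python
import shlex


def pooled_batch_cmd(tumor_bam, normal_bams, fasta, annotate, out_ref, out_dir):
    """Build a cnvkit.py batch command that pools several normal BAMs."""
    return (
        ["cnvkit.py", "batch", tumor_bam, "--normal"]
        + list(normal_bams)  # several normals give one pooled reference
        + ["-m", "wgs", "--fasta", fasta, "--annotate", annotate,
           "--output-reference", out_ref, "--output-dir", out_dir]
    )


def flasso_segment_cmd(cnr_path, cns_path):
    """Build a cnvkit.py segment command that uses fused lasso instead of CBS."""
    return ["cnvkit.py", "segment", cnr_path, "-m", "flasso", "-o", cns_path]


def render(argv):
    """Quote an argument list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # All paths below are hypothetical placeholders.
    print(render(pooled_batch_cmd(
        "tumor.bam", ["normal1.bam", "normal2.bam", "normal3.bam"],
        "rn6.fa", "genes.bed", "pooled.reference.cnn", "./pooled_out")))
    print(render(flasso_segment_cmd("tumor.cnr", "tumor.flasso.cns")))
```

Returning lists rather than strings keeps the arguments safe to hand to subprocess.run later without going through shell quoting.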

written 9 months ago by Eric T.

Thanks, we'll try this and I'll check out that reference sample.

written 9 months ago by swheelan