Question: tumor/normal WGS with CNVkit has many small segments with small copy number changes
29 days ago by
swheelan10 wrote:

Hi! We've used CNVkit fairly extensively and are quite happy with it, thanks. We now have rat tumor/normal WGS samples, and CNVkit generates many, many small (~20 kb) segments that oscillate between a log2 ratio of 0 and a log2 ratio of about -0.5. The confidence intervals of the two groups of segments do not overlap each other, and for the segments around -0.5 the CIs do not overlap zero. The coverage is good, the alignments were good, and nothing else seems odd about these samples, but this output is pretty strange. Any suggestions? Thanks!

wgs cnv cnvkit • 174 views
modified 27 days ago by Eric T.2.0k • written 29 days ago by swheelan10

You can post the figure generated by CNVkit.

written 29 days ago by chen1.5k

I haven't tried posting pictures before, so apologies if this doesn't work! Here's a shot of the scatter plot. [image: scatterplot WGS tumor/normal]

written 29 days ago by swheelan10

Are the tumor and normal samples from the same rat, or just from the same strain? There can sometimes be more heterogeneity than you expect.
I'm still not sure whether that would explain that result, though...

written 29 days ago by Chris Miller19k

They're from the same rat.

written 29 days ago by swheelan10

Interestingly, we just ran Control-FREEC on the same data and got a small handful of discrete called CNVs, everything else normal ploidy.

written 29 days ago by swheelan10
27 days ago by
Eric T.2.0k (San Francisco, CA) wrote:

How did you run CNVkit? It looks as if it was run with two separate coverage profiles (genic and intergenic?) that were not normalized to each other when combining, so half the bins have log2 values shifted downward by 0.5. This could have happened if you ran batch with WGS data but did not use -m wgs, for example.
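One way to check for the kind of offset described above is to compare the median log2 value of gene-annotated bins against that of unannotated bins in the .cnr file. The following is only a rough sketch: it assumes the standard tab-separated .cnr columns (chromosome, start, end, gene, depth, log2, weight), it assumes that unannotated bins carry one of the labels in the hypothetical `UNANNOTATED` set, and the toy values are made up for illustration.

```python
import csv
import io
from statistics import median

# Assumption: labels that mark bins without a gene annotation. Adjust to your files.
UNANNOTATED = {"-", "Antitarget", "Background"}


def median_log2_by_bin_class(cnr_handle):
    """Return (median log2 of annotated bins, median log2 of unannotated bins).

    Either value is None if that class of bins is absent from the file.
    """
    annotated, unannotated = [], []
    for row in csv.DictReader(cnr_handle, delimiter="\t"):
        value = float(row["log2"])
        if row["gene"] in UNANNOTATED:
            unannotated.append(value)
        else:
            annotated.append(value)
    return (
        median(annotated) if annotated else None,
        median(unannotated) if unannotated else None,
    )


# Toy .cnr content with made-up values: every second bin sits about 0.5 lower.
TOY_CNR = "\n".join([
    "chromosome\tstart\tend\tgene\tdepth\tlog2\tweight",
    "chr1\t0\t20000\tGeneA\t30.0\t0.02\t0.9",
    "chr1\t20000\t40000\t-\t21.0\t-0.49\t0.9",
    "chr1\t40000\t60000\tGeneB\t30.5\t-0.01\t0.9",
    "chr1\t60000\t80000\t-\t21.5\t-0.52\t0.9",
    "chr1\t80000\t100000\tGeneC\t29.5\t0.03\t0.9",
    "chr1\t100000\t120000\t-\t20.8\t-0.50\t0.9",
])

gene_med, other_med = median_log2_by_bin_class(io.StringIO(TOY_CNR))
offset = gene_med - other_med
print(f"annotated: {gene_med:.2f}  unannotated: {other_med:.2f}  offset: {offset:.2f}")
```

On real data, pass an open file handle for the tumor .cnr instead of the toy string. A consistent offset near 0.5 between the two classes would support the idea of two un-normalized coverage profiles; no offset would point elsewhere.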

written 27 days ago by Eric T.2.0k

Hmm, that would be interesting indeed. Here's the command (paths and names shortened):

batch tumor.sorted_RG_noDup.bam --normal normal.sorted_RG_noDup.bam -m wgs --fasta /path/rat/rn6/rn6.fa --annotate /path/rat/rn6/rn6_UCSC_gene_merged.bed --output-reference tn.reference.cnn --output-dir ./tvsn

written 22 days ago by swheelan10

Have you used normal.sorted_RG_noDup.bam as a reference elsewhere? If there is something odd about the coverage in that normal sample in particular, that could shift the normalized log2 ratios for tumor.sorted_RG_noDup.bam. For example, contamination with exome-enriched reads in either sample would yield similarly weird results. I recommend using a pooled reference of multiple normal samples to reduce the risk of this happening and to generally reduce noise in the results.
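To illustrate why a pooled reference helps, here is a toy sketch with made-up per-bin log2 coverage values. It uses a plain per-bin median across normals, which is a simplification chosen for illustration and not necessarily how CNVkit builds its reference; the function names are hypothetical.

```python
from statistics import median


def pooled_reference(normals):
    """Per-bin median across several normal samples (illustrative simplification)."""
    return [median(bin_values) for bin_values in zip(*normals)]


def log2_ratios(tumor, reference):
    """Tumor log2 coverage minus reference log2 coverage, bin by bin."""
    return [round(t - r, 3) for t, r in zip(tumor, reference)]


# Made-up log2 coverage for four bins. The tumor is flat (no real copy number change).
tumor = [0.0, 0.0, 0.0, 0.0]
# One normal with an odd coverage pattern: every second bin is 0.5 too high.
odd_normal = [0.0, 0.5, 0.0, 0.5]
# Two unremarkable normals with a little noise.
normal_b = [0.02, -0.01, 0.01, 0.0]
normal_c = [-0.02, 0.01, -0.01, 0.0]

# With the odd normal alone as reference, the tumor appears to oscillate.
single = log2_ratios(tumor, odd_normal)
# With a pooled reference, the odd sample is outvoted in every bin.
pooled = log2_ratios(tumor, pooled_reference([odd_normal, normal_b, normal_c]))

print("single normal:", single)
print("pooled normals:", pooled)
```

In this toy case the single-normal ratios alternate between 0 and -0.5, while the pooled ratios stay close to 0, because one unusual sample cannot move the per-bin median on its own.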

You can also try using flasso as the segmentation method (with the segment command) instead of the default cbs, as fused lasso seems to work better on large datasets like WGS.

written 17 days ago by Eric T.2.0k

Thanks, we'll try this and I'll check out that reference sample.

written 17 days ago by swheelan10