Question: CNVkit : small CNV calling
alice.choury50 wrote:

Hello everyone,

For a few weeks, we have been using CNVkit to detect gene-sized CNVs in our somatic panel. Encouraged by the results, we then ran the same analysis on our constitutional panel. Most CNVs found in this panel span 1 to 3 exons, and CNVkit does not detect them. Is this normal?

Thank you for your answer,

Alice

cnv single-exon cnvkit • 565 views
ADD COMMENTlink modified 7 months ago by Eric T.1.7k • written 8 months ago by alice.choury50
alice.choury50 wrote:

I searched the manual and found this passage:

However, note that CNVkit is less accurate in detecting CNVs smaller than 1 Mbp, typically only detecting variants that span multiple exons or captured regions. When used on exome or target panel datasets, CNVkit will not detect the small CNVs that are more common in populations.

But if we create a BED file with small regions (e.g. 25 or even 12 bp) using the -a option, it is possible to see small CNVs, down to 2 contiguous exons.
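
To make the idea of small target bins concrete, here is a minimal Python sketch of cutting one captured interval into near-equal bins of about 12 bp. It only illustrates the concept and is not CNVkit's own implementation; the function name `split_region` and the example coordinates are hypothetical.

```python
def split_region(chrom, start, end, avg_size):
    """Split one BED interval into near-equal bins of roughly avg_size bp."""
    length = end - start
    # At least one bin, even when the interval is shorter than avg_size.
    n_bins = max(1, round(length / avg_size))
    # Integer arithmetic keeps the bins contiguous and covers the whole interval.
    edges = [start + (i * length) // n_bins for i in range(n_bins + 1)]
    return [(chrom, lo, hi) for lo, hi in zip(edges[:-1], edges[1:])]


# Hypothetical 98 bp exon split into bins of about 12 bp, printed as BED lines.
for chrom, lo, hi in split_region("chr1", 100, 198, 12):
    print(chrom, lo, hi, sep="\t")
```

Smaller bins give more data points per exon, at the cost of fewer reads (and so more noise) in each bin.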

ADD COMMENTlink modified 7 months ago • written 7 months ago by alice.choury50

Hi Alice,

I am also planning to use CNVkit for my constitutional samples, and I am wondering about the lower size limit of the CNVs that CNVkit can detect. What is the size of the 2 contiguous exons that you detected? It would be helpful if you could provide more details on the BED file that you generated and the command line that you used to create the BED file with smaller regions.

Thank you in advance

ADD REPLYlink written 7 months ago by jainythomas110

Our capture is about 420 kb (only 35 genes). We sequence samples to a depth of coverage of about 300x. In the end, the 2 CNVs detected (with no false positives) are 2411 and 98 bases long. Two other single-exon CNVs were tested and could not be detected; they are 54 and 53 bp long.

The command lines:

cnvkit.py target Capture.bed --split -a 12 -o my_targets.bed

cnvkit.py antitarget my_targets.bed -a 15000 -g data/access-5k-mappable.hg19.bed -o my_antitargets.bed

This -a value is the one that gives a similar average coverage in target and off-target bins.

data/access-5k-mappable.hg19.bed is a file in the CNVkit directory.

The baseline is created from all the samples in the run (which ones are positive is not known). The remaining steps (coverage, fix, CBS segmentation, and call thresholds) are unchanged from what the manual proposes.

Do not hesitate to ask if you have further questions.

Alice

ADD REPLYlink modified 7 months ago • written 7 months ago by alice.choury50
Eric T.1.7k wrote:

Yes, CNVkit and other segmentation-based copy number callers struggle to accurately detect CNVs in constitutional samples. Using default settings, a single-exon CNV won't show up with the segmenters currently available (though you could in theory use the 'spread' and 'log2' columns in a pooled reference as the basis for a Z-test of each exon in your capture -- this is not yet supported directly).

If your sequencing data are high quality then you can subdivide the targets and antitargets more finely, as your other comment mentions, though this can result in more noise as well. Then if you've managed to increase the sensitivity of CNVkit on your data and are now seeing poor specificity, you can reduce false positives with the segmetrics --ci and call --filter ci commands.
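
As a rough illustration of what confidence-interval filtering does, the Python sketch below keeps only segments whose log2 confidence interval excludes zero. It shows the concept rather than CNVkit's code; the dictionary keys `ci_lo` and `ci_hi`, the gene names, and all numbers are assumptions made up for the example.

```python
def filter_by_ci(segments):
    """Keep segments whose log2 confidence interval does not contain zero."""
    return [s for s in segments if not (s["ci_lo"] <= 0.0 <= s["ci_hi"])]


# Hypothetical segments: the first is a clear loss, the second is compatible
# with a neutral copy number because its interval spans zero.
segments = [
    {"gene": "GENE_A", "log2": -0.95, "ci_lo": -1.10, "ci_hi": -0.80},
    {"gene": "GENE_B", "log2": 0.12, "ci_lo": -0.05, "ci_hi": 0.30},
]

for seg in filter_by_ci(segments):
    print(seg["gene"], seg["log2"])
```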

ADD COMMENTlink written 7 months ago by Eric T.1.7k

I would like to try the "in theory" method you mentioned to call single-exon CNVs. I understand how the 'log2' column can be used, but how can the 'spread' column be used for the Z-test? Can you share your ideas?

ADD REPLYlink written 13 days ago by qyang20

In my experience, we can detect single-exon CNVs in constitutional samples. We use capture; amplicon data does not work well. The targets are cut into bins of 20 bases. The antitarget bin size depends on the depth and the on-target rate (we have 100X and 70% on-target, so we cut at around 20000 bases). Then we run CNVkit as described in the manual.

If you have questions, do not hesitate to ask!

Alice

ADD REPLYlink written 11 days ago by alice.choury50

You can use the spread value as an estimate of variance, so the square root of that is your standard deviation parameter, and log2 is the mean.
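
A minimal Python sketch of that per-bin Z-test, following the interpretation stated above (spread treated as a variance, reference log2 as the mean). The function name `exon_z_test` and the example numbers are hypothetical. It also assumes the sample log2 is on the same scale as the reference log2, meaning the reference has not already been subtracted from it.

```python
import math
from statistics import NormalDist


def exon_z_test(sample_log2, ref_log2, ref_spread):
    """Return (z, two-sided p-value) for one bin, treating spread as a variance."""
    if ref_spread <= 0:
        # No usable variance estimate for this bin.
        return math.nan, math.nan
    sd = math.sqrt(ref_spread)
    z = (sample_log2 - ref_log2) / sd
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical bin: the sample sits 1 log2 unit below the pooled reference.
z, p = exon_z_test(sample_log2=-1.0, ref_log2=0.0, ref_spread=0.04)
print(f"z = {z:.2f}, p = {p:.2e}")
```

With many bins tested per sample, the p-values would need a multiple-testing correction before calling anything.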

ADD REPLYlink written 10 days ago by Eric T.1.7k