Question: CNVkit: Choice of Bin Size and CNV Calling
8 months ago, wei.wei wrote:

I noticed that the choice of target bin size has a great impact on the result. I ran a sample with the default target bin size of 5000, and when I ran cnvkit.py genemetrics sample.cnr, it reported 0 gene-level gains or losses. However, for the same sample, when I reduced the bin size to 1000, gains and losses were reported, some with strongly negative log2 values.

I was wondering what a good choice of bin size would be. How can I make sure I don't miss important gains and losses while also minimizing false positives? Will the calculated default always be a reasonable choice, or are there other factors I should take into consideration?

I hope the question is clear. Thank you.

Tags: genemetrics, cnvkit

Please reframe your question. See "How To Ask Good Questions On Technical And Scientific Forums".

(reply by lakhujanivijay, 8 months ago)
8 months ago, Chris Miller (Washington University in St. Louis, MO) wrote:

It's a tradeoff: larger bins reduce noise, while smaller bins increase sensitivity (smaller events become detectable). Panels C and D of this figure, along with some of the text immediately below it, may offer some clarity on that point: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0016327#pone-0016327-g001
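
To put rough numbers on the noise side of that tradeoff, here is a minimal sketch. It assumes pure Poisson counting noise, 100 bp reads, and a noise-free reference; none of those assumptions come from the thread, and real data will be noisier because of GC content and mappability effects. The function names are made up for illustration and are not part of CNVkit.

```python
import math

def expected_reads_per_bin(coverage, bin_size, read_length=100):
    """Approximate number of reads falling in one bin at a given mean depth.

    read_length=100 is an assumption; substitute your own read length.
    """
    return coverage * bin_size / read_length

def log2_ratio_sd(reads_per_bin):
    """Approximate SD of a bin's log2 ratio from Poisson noise alone.

    Delta method: Var(ln X) ~ 1/lambda for a Poisson count X with mean lambda,
    so SD(log2 X) ~ 1 / (ln(2) * sqrt(lambda)).
    """
    return 1.0 / (math.log(2) * math.sqrt(reads_per_bin))

# Compare a few bin sizes at 30x coverage
for bin_size in (1_000, 5_000, 10_000):
    n = expected_reads_per_bin(30, bin_size)
    print(f"{bin_size:>6} bp bins: ~{n:.0f} reads/bin, log2 SD ~{log2_ratio_sd(n):.3f}")
```

Under these assumptions the per-bin noise shrinks with the square root of the bin size, so going from 1 kb to 10 kb bins cuts the Poisson noise by a factor of about 3 while making events shorter than a few bins harder to see.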

In general, for a 30x whole genome, 10 kb is a good bin size for high-confidence calls, but 1 kb can also be perfectly acceptable if proper filtering is used.
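
As one illustration of what such filtering could look like, the sketch below keeps a gene-level call only if enough bins support it and the mean log2 ratio is far enough from zero. This is not CNVkit's implementation: the function name, the plain unweighted mean, and the sample data are all made up here. The default values 0.2 and 3 are chosen to resemble what I believe are the `-t` and `-m` defaults of `cnvkit.py genemetrics`, so check the CNVkit documentation before relying on that.

```python
from collections import defaultdict

def gene_level_calls(bins, threshold=0.2, min_bins=3):
    """Return {gene: (mean_log2, n_bins)} for genes passing both filters.

    bins: iterable of (gene, log2) pairs, one pair per bin.
    threshold: minimum absolute mean log2 ratio to report a gain or loss.
    min_bins: minimum number of bins that must cover the gene.
    """
    by_gene = defaultdict(list)
    for gene, log2 in bins:
        by_gene[gene].append(log2)

    calls = {}
    for gene, values in by_gene.items():
        mean_log2 = sum(values) / len(values)
        if len(values) >= min_bins and abs(mean_log2) >= threshold:
            calls[gene] = (mean_log2, len(values))
    return calls

# Made-up bins, for illustration only
bins = [
    ("GENE_A", -1.1), ("GENE_A", -0.9), ("GENE_A", -1.0),   # consistent loss over 3 bins
    ("GENE_B", 0.9),                                         # a single outlying bin
    ("GENE_C", 0.05), ("GENE_C", -0.02), ("GENE_C", 0.01),  # close to neutral
]
print(gene_level_calls(bins))
```

A filter of this kind may also help explain the original observation: with larger bins, a short gene can be covered by fewer bins than the minimum and so never be reported, while smaller bins give it enough bins to pass but with noisier values per bin.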


Hi Chris,

Thanks a lot. Just curious: I saw that readDepth is no longer available for newer R versions, and neither is copyCat. Are there any other similar R packages that can be used for CNV calling? I'm working with a yeast genome.

(reply by wei.wei, 8 months ago)

As far as I know, CopyCat should work fine with newer R versions, and it has a single-sample mode.

(reply by Chris Miller, 8 months ago)