I have a question regarding CNV analysis (new CNVkit user). I am analyzing tumor samples from whole exome sequencing using CNVkit (v 0.9.3). I followed the workflow from the CNV tutorial:
- First, I ran the 'batch' command on all my normal samples to generate a pooled reference (n = 50 normal samples).
- Next, using the pooled reference, I called copy number information from my tumor samples (n > 700 tumor samples).
- Then, I ran the 'metrics' command to evaluate sample quality, inspected the coverages, and checked for tumor samples with an extremely high number of segments so they could be removed (all looked ok).
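
In command form, the steps above would look roughly like the sketch below. It is illustrative rather than the exact invocation used: all paths, file names, and the sample name `Sample` are placeholders, and the `run` helper is a hypothetical wrapper that prints each command instead of executing it unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

workflow() {
  # 1. Build a pooled reference from the normal samples only (placeholder paths).
  run cnvkit.py batch --normal normals/*.bam \
      --targets targets.bed --fasta reference.fa \
      --output-reference pooled_reference.cnn --output-dir results/

  # 2. Process the tumor samples against the pooled reference.
  run cnvkit.py batch tumors/*.bam \
      --reference pooled_reference.cnn --output-dir results/

  # 3. Compute per-sample quality metrics from bin-level and segment files.
  run cnvkit.py metrics results/*.cnr -s results/*.cns

  # 4. Call copy number for one sample and draw its scatter plot.
  run cnvkit.py call results/Sample.cns -o results/Sample.call.cns
  run cnvkit.py scatter results/Sample.cnr -s results/Sample.call.cns \
      -o Sample-scatter.pdf
}

workflow
```

Running the script as-is only prints the commands; running it with `DRY_RUN=0` would execute them, assuming CNVkit is installed and the placeholder paths are replaced with real ones.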
Note, I am using a pooled normal as a reference, since my samples don't have a matched normal control.
I was able to generate the scatter plot from the 'call' command output using the default parameters (attached). As you can see, the scatter plot is very noisy. For your reference, I also tried increasing the bin size in the 'batch' command to see whether this would reduce the noise, but it made no difference to the noise level.
My question is: do you have any suggestions for dealing with noisy samples such as this?
Many thanks in advance for your help!

Regards,
Fil

![Scatter plot from 'call' output (default parameters)]