Hi,
I am planning to use some normal samples for CNV analysis, but first I want to check whether the normal samples themselves are actually "normal" or whether they contain major CNVs of their own. For this I am currently using CNVkit.
As described in the documentation, I've built a "flat" reference from the target/antitarget BED files. I then ran cnvkit.py batch *_normals.bam -r flat_reference.cnn to analyse the normal samples. This gives me sample.cnr, sample.targetcoverage.cnn and sample.antitargetcoverage.cnn as outputs. The issue is that the plots I get from scatter look very noisy (a lot of gray dots all over the place). Do I need to run cnvkit.py segment to obtain the .cns files as well, or are they not relevant in this case (batch didn't seem to output them)? I've also read that cnvkit.py metrics might give some information, but I am unsure how exactly to interpret its output.
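To make "noisy" a bit more concrete, here is a minimal sketch of how the spread of the bin-level log2 ratios in a .cnr file could be summarised with plain Python. It assumes the .cnr file is tab-separated with a header row containing a log2 column; the function names are my own, and this is only a rough stand-in, not a reimplementation of what cnvkit.py metrics reports.

```python
import csv
import statistics
import sys


def read_log2(path):
    """Read the 'log2' column of a tab-separated table with a header row."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [float(row["log2"]) for row in reader]


def spread_summary(log2_values):
    """Summarise how widely the bin-level log2 ratios scatter."""
    med = statistics.median(log2_values)
    # Median absolute deviation: robust to a handful of extreme bins.
    mad = statistics.median([abs(v - med) for v in log2_values])
    q1, _, q3 = statistics.quantiles(log2_values, n=4)
    return {"median": med, "mad": mad, "iqr": q3 - q1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python spread.py sample.cnr
    for cnr_path in sys.argv[1:]:
        print(cnr_path, spread_summary(read_log2(cnr_path)))
```

Comparing these numbers across the normals would at least show whether one sample is much noisier than the rest, even if it doesn't say what an acceptable value is.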
If I do run segment on each of the normals and then plot them with scatter, the resulting plot shows mostly an orange line at 0, with a few orange dots slightly above or below it (though in one of the normal samples there is a single dot at around -8 on the y-axis). Would this indicate that the samples are relatively normal (against a flat reference)?
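One way to check whether a deviation like the dot near -8 is backed by many bins or only by one or two is to list the segments that sit away from 0 together with their bin counts. The sketch below assumes the .cns file is tab-separated with chromosome, start, end, log2 and probes columns; the cutoffs (0.5 and 5) are arbitrary placeholders I picked for illustration, not recommended thresholds.

```python
import csv
import sys


def flag_segments(rows, log2_cutoff=0.5, min_probes=5):
    """Split segments that deviate from 0 into well-supported and
    poorly-supported groups, based on how many bins back each one."""
    candidates, low_support = [], []
    for row in rows:
        log2 = float(row["log2"])
        probes = int(row["probes"])
        if abs(log2) < log2_cutoff:
            continue  # close to the flat reference, nothing to report
        record = (row["chromosome"], int(row["start"]), int(row["end"]),
                  log2, probes)
        if probes < min_probes:
            low_support.append(record)  # few bins: possibly an artifact
        else:
            candidates.append(record)   # many bins: worth a closer look
    return candidates, low_support


def read_segments(path):
    """Load a tab-separated segment table with a header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python flag.py sample.cns
    for cns_path in sys.argv[1:]:
        strong, weak = flag_segments(read_segments(cns_path))
        print(cns_path, "candidates:", strong, "low support:", weak)
```

If the extreme segment turns out to cover only a bin or two, that would at least suggest looking at the coverage of those bins before treating it as a real deletion.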
(Also, the batch command doesn't seem to output the segment/.cns files at all. I saw GitHub issues about it but no solution has been posted yet; the version is 0.9.9.)
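As a workaround for the missing .cns files, this is roughly how I am running the segmentation step on each .cnr file by hand. It is only a sketch: the helper name and the choice to write each .cns next to its .cnr are my own, and it assumes cnvkit.py is on the PATH.

```python
import subprocess
import sys
from pathlib import Path


def segment_command(cnr_path):
    """Build the command that segments one .cnr file into a .cns file
    written alongside it."""
    cnr = Path(cnr_path)
    cns = cnr.with_suffix(".cns")
    return ["cnvkit.py", "segment", str(cnr), "-o", str(cns)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python run_segment.py normal1.cnr normal2.cnr ...
    for cnr_path in sys.argv[1:]:
        command = segment_command(cnr_path)
        print("running:", " ".join(command))
        subprocess.run(command, check=True)  # stop on the first failure
```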
Any help is appreciated!