I need some help with understanding the output of CNVkit, specifically Segmented log2 ratios (.cns) and the exported CNVs in VCF format.
I'm looking to get the copy number of every region found by CNVkit.
For the segmented log2 ratios (.cns) file:
Please correct me if I'm wrong: to get the actual estimated copy number, I should simply take the antilog of the log2 column, right?
In that case, what is the relationship, if any, between the log2 value in the .cns file and the inferred SVLEN value in the .vcf file?
Is there any connection to the "CN" (copy number genotype...) value in the .vcf? Also, why does "CN" only appear in duplication events? I tried calculating the copy number from the log2 value in the .cns file, but the values are different from what I expected.
An example if it helps:
For a certain region in the .cns file, the log2 value is 0.382191.
In the .vcf file, SVLEN is 6812878 and the CN value is 3. What is the copy number then?
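
To make the arithmetic in this example concrete, here is a minimal sketch of one possible conversion. It assumes a diploid reference (ploidy 2) and no tumor purity correction, which may not match what CNVkit does internally; the function name `log2_to_copy_number` is hypothetical and not part of CNVkit.

```python
def log2_to_copy_number(log2_ratio, ploidy=2):
    """Estimate absolute copy number from a log2 copy ratio.

    Assumption: the ratio is relative to a reference with `ploidy` copies,
    and the sample is pure (no normal-cell contamination).
    """
    ratio = 2 ** log2_ratio   # antilog: log2 ratio -> linear copy ratio
    return ploidy * ratio     # scale by the reference ploidy


cn = log2_to_copy_number(0.382191)
print(round(cn, 2))  # 2.61
print(round(cn))     # 3
```

Under these assumptions, the antilog alone gives a ratio of about 1.30, and scaling it by the ploidy gives about 2.61, which rounds to the CN value of 3 in the .vcf file.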