Hi there,
I am working with a VCF that includes Copy Number Variants and hoped to confirm that I am reading the data correctly.
22 16050654 DUP_gs_CNV_22_16050654_16063474 A <CN0>,<CN2>,<CN3>,<CN4> 100 PASS AC=9,87,599,20;AF=0.00179712,0.0173722,0.119609,0.00399361;AN=5008;CS=DUP_gs;END=16063474;NS=2504;SVTYPE=CNV GT 3|0 0|0 0|0 0|0 0|0
From the line above, extracted from my VCF, I am assuming the following:
- Samples with the 0 genotype have an A at chromosome 22, position 16050654
- For samples exhibiting a CNV, the copied block runs from position 16050654 to 16063474
- Samples with, for example, the 3 genotype have <CN3> copies of this block; <CN3> means the block is copied 3 times
The final assumption is my main concern: does <CNx> always mean there are x copies of the block?
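To make the reading above concrete, here is a minimal Python sketch that splits the record from this post and resolves each genotype index to the allele it points at. It relies on the VCF convention that GT index 0 is REF and indices 1..n are the ALT alleles in the order listed. The function name `parse_record` is hypothetical, the line is split on whitespace because the tabs were lost when pasting, and missing genotypes (`.`) are not handled. It does not read the header, so it only shows which symbolic allele a sample carries, not what `<CNx>` is defined to mean in this file.

```python
import re

# Data line copied from the post (tabs were lost in pasting, so split on whitespace).
record = (
    "22 16050654 DUP_gs_CNV_22_16050654_16063474 A <CN0>,<CN2>,<CN3>,<CN4> 100 PASS "
    "AC=9,87,599,20;AF=0.00179712,0.0173722,0.119609,0.00399361;AN=5008;"
    "CS=DUP_gs;END=16063474;NS=2504;SVTYPE=CNV "
    "GT 3|0 0|0 0|0 0|0 0|0"
)


def parse_record(line):
    """Hypothetical helper: return (chrom, start, end, alleles, genotypes)."""
    fields = line.split()
    chrom, pos, _id, ref, alt, _qual, _filter, info, _fmt = fields[:9]

    # Allele list indexed the way GT indexes it: 0 = REF, 1..n = ALT in order.
    alleles = [ref] + alt.split(",")

    # INFO is a ';'-separated list of KEY=VALUE pairs (flags without '=' are skipped).
    info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)

    # Each sample's GT is a pair of allele indices, phased ('|') or unphased ('/').
    genotypes = [
        [alleles[int(i)] for i in re.split(r"[|/]", gt)] for gt in fields[9:]
    ]
    return chrom, int(pos), int(info_map["END"]), alleles, genotypes


chrom, start, end, alleles, genotypes = parse_record(record)
print(chrom, start, end)   # 22 16050654 16063474
print(genotypes[0])        # ['<CN3>', 'A']
print(alleles[1])          # <CN0>
```

Note that genotype index 1 resolves to `<CN0>` here, so the index in the GT field is a position in the allele list rather than a copy number; index 3 matching `<CN3>` is a coincidence of this record's ALT order.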
If anyone can confirm that I am reading this correctly, or otherwise set me straight, I would be most grateful!
As an aside, as this is my first post, I'd love to thank the Biostars community for all the ongoing help and expertise! You guys are great!
Cheers
MrGraeme
Ah! That's great, I didn't realise VCF's header provided that kind of detail...
Many thanks!
MrGraeme