I've noticed custom symbolic alternate alleles for structural variants in a few VCFs. For example, the gnomAD SV VCF:
...
##ALT=<ID=CTX,Description="Reciprocal chromosomal translocation">
...
12 60718971 gnomAD_v2_CTX_12_13 N <CTX> 999 PASS END=57020218;SVTYPE=CTX;CHR2=13;SVLEN=-1
and the 1000 Genomes phase 3 integrated call set:
...
##ALT=<ID=CN2,Description="Copy number allele: 2 copies">
...
1 668630 esv3584976 G <CN2> 100 PASS AC=64;AF=0.0127796
The problem I'm having, though, is that the VCF 4.3 spec reads (emphasis added, 4.2 and 4.1 specs are similar):
1.4.5 Alternative allele field format
Symbolic alternate alleles are described as follows:
##ALT=<ID=type,Description=description>
Structural Variants
In symbolic alternate alleles for imprecise structural variants, the ID field indicates the type of structural variant, and can be a colon-separated list of types and subtypes. ID values are case sensitive strings and must not contain whitespace or angle brackets. The first level type must be one of the following:
- DEL Deletion relative to the reference
- INS Insertion of novel sequence relative to the reference
- DUP Region of elevated copy number relative to the reference
- INV Inversion of reference sequence
- CNV Copy number variable region (may be both deletion and duplication)
- BND Breakend
The CNV category should not be used when a more specific category can be applied. Reserved subtypes include:
- DUP:TANDEM Tandem duplication
- DEL:ME Deletion of mobile element relative to the reference
- INS:ME Insertion of a mobile element relative to the reference
The way I read that, the symbolic alternate allele must start with one of those ID values, i.e. <CTX> should really be something like <BND:CTX>, and <CN2> should be something like <CNV:CN2>.
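To make that reading concrete, here is a minimal Python sketch that pulls the ID out of each `##ALT` header line and checks whether its first colon-separated component is one of the six first-level types quoted above. It only encodes my strict interpretation of section 1.4.5, so a `False` here is not a verdict on whether the file is actually valid. The function names are my own, and the `DUP:TANDEM` header line is an illustrative example I added for contrast, not a line from either VCF.

```python
import re

# First-level SV types listed in section 1.4.5 of the VCF 4.3 spec (quoted above).
ALLOWED_TOP_LEVEL = {"DEL", "INS", "DUP", "INV", "CNV", "BND"}

# Captures the ID value of a ##ALT meta-information line.
ALT_HEADER = re.compile(r"^##ALT=<ID=([^,>]+)")


def top_level_type(alt_id):
    """Return the first colon-separated component of an ALT ID."""
    return alt_id.split(":", 1)[0]


def follows_strict_reading(alt_id):
    """True if the ID starts with one of the listed first-level types."""
    return top_level_type(alt_id) in ALLOWED_TOP_LEVEL


def check_header_lines(lines):
    """Map each ##ALT ID found in `lines` to the result of the strict check."""
    results = {}
    for line in lines:
        match = ALT_HEADER.match(line)
        if match:
            alt_id = match.group(1)
            results[alt_id] = follows_strict_reading(alt_id)
    return results


header = [
    '##ALT=<ID=CTX,Description="Reciprocal chromosomal translocation">',
    '##ALT=<ID=CN2,Description="Copy number allele: 2 copies">',
    # Illustrative line (not from either VCF) using a reserved subtype.
    '##ALT=<ID=DUP:TANDEM,Description="Tandem duplication">',
]

print(check_header_lines(header))
# {'CTX': False, 'CN2': False, 'DUP:TANDEM': True}
```

Under this reading, the hypothetical IDs `BND:CTX` and `CNV:CN2` would pass the check, while the `CTX` and `CN2` IDs used in the two call sets would not.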
Are <CTX>, <CN2>, etc. actually invalid, or am I misunderstanding the spec?
Seems related to "Structural variants in gnomAD and the VCF spec. Why is tabix/bcftools failing?"
Yes, I was just about to comment that that question (CTX in INFO/SVTYPE, which is apparently valid) is what led me down the path to this question (CTX in ALT, which may be invalid). Thanks.