Symbolic alternate allele in the VCF ALT field
5.2 years ago
tsimakova ▴ 20

Hello,

The alternate allele in a VCF is expressed as a sequence of one or more A/C/G/T nucleotides. According to the VCF 4.1 specification, symbolic alternate alleles are used for imprecise structural variants (e.g. <DEL>, <DUP> and so on). I'm working with manually generated VCF files, so I'd like to know: is it possible to use symbolic alternate alleles for long deletions/insertions with known breakpoints? And if it is allowed, are there any recommendations on the minimum variant length for using a symbolic alternate allele?

Thanks,

Tamara  

CNV VCF ALT • 2.7k views
5.1 years ago
Eric T. ★ 2.6k

Yes, you can use symbolic alternate alleles for precise structural variants, too. Here's the VCF 4.2 specification. In short, you put the start position in the POS column and the end position in the INFO column using an "END=1234..." key=value entry (END carries a value, so it is not a flag).
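To make the layout concrete, here is a minimal Python sketch that formats one data line for a precise deletion with a symbolic ALT. The function name `symbolic_deletion_record` and the coordinates are made up for illustration. SVTYPE and SVLEN are INFO keys the spec reserves for structural variants; including them is my assumption about what you would want to report. As far as I read the spec, when ALT is symbolic the POS column holds the base just before the event and REF holds that padding base, so the sketch follows that convention.

```python
def symbolic_deletion_record(chrom, first_deleted, last_deleted, padding_base, var_id="."):
    """Format one VCF data line for a precise deletion with ALT=<DEL>.

    first_deleted / last_deleted are 1-based inclusive coordinates of the
    deleted bases; padding_base is the reference base at first_deleted - 1.
    (Hypothetical helper, not part of any library.)
    """
    pos = first_deleted - 1                        # POS = base preceding the event
    end = last_deleted                             # END = last deleted base
    svlen = -(last_deleted - first_deleted + 1)    # negative length for deletions
    info = f"SVTYPE=DEL;END={end};SVLEN={svlen}"
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO
    fields = [chrom, str(pos), var_id, padding_base, "<DEL>", ".", ".", info]
    return "\t".join(fields)


line = symbolic_deletion_record("1", 1001, 1234, "A")
print(line)
```

A file using this would also need matching `##ALT` and `##INFO` header lines for `<DEL>`, END, SVTYPE and SVLEN.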

For minimum variant length, it makes sense to only show indel sequences in the ALT column if the exact sequence is known, which I suppose is most reliable (and easiest to phase) if the indel is entirely contained within a single mapped read, rather than spanning a read pair. So, if your read length is 100bp, a cutoff of 50bp for indel sequences is reasonable, and you can represent anything larger as a symbolic alt allele in your VCF file.
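If you generate the files with a script, the cutoff rule above could be sketched roughly like this in Python. The helper `choose_alt` and its parameters are hypothetical, and the cutoff of half the read length is just the rule of thumb from this answer, not something the specification prescribes. Both input sequences are assumed to start with the shared padding base.

```python
def choose_alt(ref_seq, alt_seq, sv_type, read_length=100):
    """Pick explicit or symbolic alleles for one indel.

    ref_seq and alt_seq both start with the padding base. Returns the
    (REF, ALT) pair to write to the VCF. (Hypothetical helper.)
    """
    cutoff = read_length // 2                       # rule of thumb, not from the spec
    event_length = abs(len(ref_seq) - len(alt_seq))
    if event_length <= cutoff:
        return ref_seq, alt_seq                     # short event: spell out the sequence
    # Long event: keep only the padding base in REF, use a symbolic ALT
    return ref_seq[0], f"<{sv_type}>"


short_del = choose_alt("ACGT", "A", "DEL")
long_del = choose_alt("A" + "C" * 60, "A", "DEL")
print(short_del, long_del)
```

For a long event you would still add END (and optionally SVLEN) to the INFO column, since the symbolic allele no longer carries the length.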
