How to get the nucleotide sequence behind a symbolic ALT in a VCF
13 months ago
Maxine ▴ 40

I have a VCF of structural variants called with Sniffles2. Some entries in the ALT column are symbolic alleles, such as <INS>, rather than nucleotide sequences. The VCF looks like this:

#CHROM      POS   ID    REF   ALT   QUAL  FILTER      INFO  FORMAT      cy201704    cy201804    cy201904    cy202304
NC_058089.1 79333062    Sniffles2.INS.DF23M9    T     <INS> 37    PASS  PRECISE;SVTYPE=INS;SVLEN=7772;END=79333062;SUPPORT=8;COVERAGE=24,25,26,28,37;STRAND=+;AC=2;STDEV_LEN=0;STDEV_POS=0;SUPP_VEC=001000000000  GT:GQ:DR:DV:ID    0/0:0:22:0:NULL   0/0:0:31:0:NULL   1/1:60:0:40:Sniffles2.INS.13174S9   0/0:0:21:0:NULL

However, I need the actual nucleotide sequences for downstream analysis. Has anybody run into a similar problem before?
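
To make the problem concrete, here is a minimal sketch in plain Python (no VCF library assumed) that lists the records whose ALT is symbolic, along with the SVTYPE and SVLEN from INFO. The function names `parse_info` and `symbolic_records` are made up for illustration, and the code splits on any whitespace because the record pasted above lost its tabs; a real VCF is tab-delimited. It only finds the affected records and does not recover the missing sequence.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flag entries (e.g. PRECISE) map to True."""
    out = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def symbolic_records(lines):
    """Yield one dict per symbolic ALT allele (an allele of the form <...>)."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        # Assumption: no field contains whitespace, so a plain split() is enough.
        fields = line.split()
        chrom, pos, rec_id, ref, alt = fields[:5]
        info = parse_info(fields[7])
        for allele in alt.split(","):
            if allele.startswith("<") and allele.endswith(">"):
                yield {
                    "chrom": chrom,
                    "pos": int(pos),
                    "id": rec_id,
                    "alt": allele,
                    "svtype": info.get("SVTYPE"),
                    "svlen": info.get("SVLEN"),  # kept as the raw INFO string
                }


# The record from this post, with the sample columns left out for brevity.
record = (
    "NC_058089.1 79333062 Sniffles2.INS.DF23M9 T <INS> 37 PASS "
    "PRECISE;SVTYPE=INS;SVLEN=7772;END=79333062;SUPPORT=8"
)

for rec in symbolic_records([record]):
    print(rec["chrom"], rec["pos"], rec["id"], rec["alt"], rec["svlen"])
```

For a file on disk, the same generator can be fed an open file handle, e.g. `symbolic_records(open("calls.vcf"))`, where `calls.vcf` is a placeholder name.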

SV vcf sniffles • 318 views