Question: adding custom INFO tag to vcf
bioguy24 (Chicago) wrote, 2.0 years ago:

I am trying to add a few custom INFO tags to a VCF 4.1 file. In the VCF below, the last 4 tab-delimited fields of each record (GOOD 103 hom 16 and GOOD 139 het 8) are not defined by any INFO header line.

My thought (though probably not the best) was to add 4 INFO tags for these:

sed -i '10i\##INFO=<ID=,Type=Integer,Description="Variant quality">\'
sed -i '11i\##INFO=<ID=,Type=String,Description="Reads">\'
sed -i '12\##INFO=<ID=,String,Type=Float,Description="Zygosity">\'
sed -i '13i\##INFO=<ID=,Type=Integer,Description="Score">\'

I am not sure if the above will work or not as each one of the 4 fields does not have an ID=.

VCF:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  xxxx
chr1    948846  .   T   TA  529.927 PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;  FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395  GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    GOOD    103 hom 16
chr1    948870  .   C   G   279.296 PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678   GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1  GOOD    139 het 8

desired vcf:

##INFO=<ID=,Type=Integer,Description="Variant quality">
##INFO=<ID=,Type=String,Description="Reads">
##INFO=<ID=,String,Type=Float,Description="Zygosity">
##INFO=<ID=,Type=Integer,Description="Score">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  xxxx
chr1    948846  .   T   TA  529.927 PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;  FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395  GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    GOOD    103 hom 16
chr1    948870  .   C   G   279.296 PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678   GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1  GOOD    139 het 8

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 2.0 years ago:

I don't think you can have an empty ID in a VCF INFO header, and <ID=,String,Type=Flo looks really wrong. Anyway.

The following awk script adds the 4 INFO header lines just before the #CHROM header line:

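awk '/^#CHROM/ {
    # Emit the 4 INFO header lines just before the #CHROM line.
    # Number=1 is added because the VCF spec requires a Number field in every INFO header line.
    print "##INFO=<ID=ID1,Number=1,Type=Integer,Description=\"Variant quality\">"
    print "##INFO=<ID=ID2,Number=1,Type=String,Description=\"Reads\">"
    print "##INFO=<ID=ID3,Number=1,Type=Float,Description=\"Zygosity\">"
    print "##INFO=<ID=ID4,Number=1,Type=Integer,Description=\"Score\">"
}
{ print }' input.vcf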
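
Declaring the tags in the header does not by itself make the four trailing columns part of INFO: in a VCF, every column after FORMAT is read as a sample column. A minimal sketch of one way to move those values into the INFO column (column 8) is shown below. It assumes a single-sample, tab-delimited file with exactly four extra columns; the tiny input.vcf, the output.vcf name, and the Type= choices (picked to match the sample values GOOD, 139, het and 8 rather than the types in the question) are illustrative assumptions, and the IDs reuse the ID1..ID4 placeholders from above.

```shell
#!/bin/sh
# Hypothetical miniature input: a header plus one record that carries
# four extra tab-separated columns after the single sample column.
printf '##fileformat=VCFv4.1\n' > input.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\txxxx\n' >> input.vcf
printf 'chr1\t948870\t.\tC\tG\t279.296\tPASS\tDP=139\tGT\t0/1\tGOOD\t139\thet\t8\n' >> input.vcf

awk 'BEGIN { FS = OFS = "\t" }
/^#CHROM/ {
    # Declare the new tags just before the column header line.
    print "##INFO=<ID=ID1,Number=1,Type=String,Description=\"Variant quality\">"
    print "##INFO=<ID=ID2,Number=1,Type=Integer,Description=\"Reads\">"
    print "##INFO=<ID=ID3,Number=1,Type=String,Description=\"Zygosity\">"
    print "##INFO=<ID=ID4,Number=1,Type=Integer,Description=\"Score\">"
}
/^#/ { print; next }        # pass every header line through unchanged
NF >= 14 {
    # Append the four trailing columns (11-14) to INFO as key=value pairs.
    $8 = $8 ";ID1=" $11 ";ID2=" $12 ";ID3=" $13 ";ID4=" $14
    # Rebuild the record from the first 10 columns only.
    line = $1
    for (i = 2; i <= 10; i++) line = line OFS $i
    print line
    next
}
{ print }                   # records without the extra columns stay as they are
' input.vcf > output.vcf
```

Records are rebuilt column by column instead of assigning to NF, which keeps the script independent of how a particular awk handles a shrinking NF. For a multi-sample file the column numbers 10 to 14 would need to be adjusted.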

Thank you very much :).

(reply written 2.0 years ago by bioguy24)