Is there a way to Scan and Correct bad vcf INFO values?
2.4 years ago
bwubb • 0

Greetings,

I am working with two VCF files I generated with GATK v3.7. Both were produced with the parameters -G Standard -G AS_Standard, but one was generated in GGA (GENOTYPE_GIVEN_ALLELES) mode from the sites of the other. The GGA VCF is missing some AS_Standard values.

##INFO=<ID=AS_MQ,Number=A,Type=Float,Description="Allele-specific RMS Mapping Quality">

However, an unknown number of variants are annotated with a bare AS_MQ; with no float value given. I noticed this when I tried to merge the variants shared by the two files using bcftools.

The exact error was: Not ready for type [0]: AS_MQ at 12133603

I was able to merge after using sed to replace every bare AS_MQ; with AS_MQ=.; but I was wondering whether any tool or command could have done this for me, in a VCF-format-aware manner, for any wrong or missing value. I can build in a manual check for the AS_MQ fix, but I am trying to make it foolproof in case the same thing ever happens with another annotation.
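
For illustration, a header-aware version of that sed fix could look like the minimal sketch below. It is not an existing tool: the function names (info_types, fix_info, fix_vcf_lines) are hypothetical, and it assumes an uncompressed, tab-delimited VCF whose ##INFO header lines declare a Type. It works on plain text rather than through a VCF library, because the htslib-based parser in bcftools is what rejects these records. Any INFO key that the header declares as a non-Flag type but that appears without a value is rewritten to KEY=. and everything else is left untouched.

```python
import re

# Matches header lines such as ##INFO=<ID=AS_MQ,Number=A,Type=Float,...>
INFO_HEADER = re.compile(r'^##INFO=<ID=([^,>]+),.*?Type=([A-Za-z]+)')


def info_types(header_lines):
    """Map each declared INFO ID to its Type (Float, Integer, Flag, ...)."""
    types = {}
    for line in header_lines:
        m = INFO_HEADER.match(line)
        if m:
            types[m.group(1)] = m.group(2)
    return types


def fix_info(info, types):
    """Rewrite valueless keys declared as non-Flag to KEY=. (the VCF missing value)."""
    if info == '.':
        return info
    fixed = []
    for entry in info.split(';'):
        key, _, value = entry.partition('=')
        declared = types.get(key)
        # Only touch keys the header knows about; Flag keys legitimately have no value.
        if declared is not None and declared != 'Flag' and value == '':
            entry = key + '=.'
        fixed.append(entry)
    return ';'.join(fixed)


def fix_vcf_lines(lines):
    """Yield VCF lines with the INFO column (8th field) repaired."""
    types = {}
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('##'):
            # Collect INFO declarations as the header streams past.
            types.update(info_types([line]))
            yield line
        elif line.startswith('#'):
            yield line  # the #CHROM column header line
        else:
            cols = line.split('\t')
            if len(cols) >= 8:
                cols[7] = fix_info(cols[7], types)
            yield '\t'.join(cols)
```

Feeding fix_vcf_lines the lines of an open file (or sys.stdin) and printing what it yields should give a VCF that differs from the input only in those repaired INFO entries. Keys missing from the header are deliberately left alone, since the sketch has no way to tell whether they are flags.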

Thank you!

software error vcf INFO • 578 views

GATK has a history of producing VCFs that are incompatible with other tools. Could you paste the VCF header along with a few offending and non-offending lines so that we can try it on our end? When you paste them, highlight the text and wrap it using the 101 010 button.
