how to fix old VCF headers
5.6 years ago
Richard ▴ 590

Hi folks,

I'm working with some older VCF files and trying to get them to work with recent Picard and GATK commands. I'm running into a lot of errors like this one:

Unable to parse header with error: Invalid count number, with fixed count the number should be 1 or higher: key=INFO name=NMD.GENE

I've seen these errors for a lot of fields, and I'm going through them one by one trying to fix each. However, the list is growing to the point that I thought there might be a tool out there that can fix all of these errors at once. I've seen the same or a similar error for the following fields in my VCF file, each one complaining about a Number=0 entry in the VCF header:

  • PL
  • AA_LEN
  • NMD.NUMTR
  • EFF.EFFECT
  • SVLEN
  • NMD.GENE
  • NMD.GENEID
  • EFF.EXID
  • EFF.BIOTYPE
  • CNADJ
  • (the list goes on)
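
To list every affected field in one pass instead of discovering them one error at a time, a header scan along these lines might help. This is only a sketch: the function name `find_suspect_fields` is made up for illustration, and it assumes the header lines use the `##INFO=<ID=...,Number=...,Type=...>` layout. It flags non-Flag fields with Number=0 and any field with a negative count, on the assumption that those are the entries the parser rejects (Type=Flag fields are expected to have Number=0, so they are left alone).

```python
import re

# Matches INFO/FORMAT meta-information lines and captures the part inside <...>.
HEADER_RE = re.compile(r"^##(INFO|FORMAT)=<(.*)>\s*$")


def _get(body, key):
    """Return the value of key=value inside a header body, or None if absent."""
    match = re.search(r"(?:^|,)" + key + r"=([^,]+)", body)
    return match.group(1) if match else None


def find_suspect_fields(lines):
    """Return (kind, ID, Number, Type) for header fields with a suspicious count.

    A field is reported when Number is a negative integer, or when Number is 0
    and the Type is not Flag. Non-integer counts such as '.', 'A' or 'G' are
    never reported.
    """
    suspects = []
    for line in lines:
        if not line.startswith("##"):
            break  # header is over once the #CHROM line or data starts
        header = HEADER_RE.match(line)
        if header is None:
            continue
        kind, body = header.groups()
        number, vtype = _get(body, "Number"), _get(body, "Type")
        if number is None or not re.fullmatch(r"-?\d+", number):
            continue
        count = int(number)
        if count < 0 or (count == 0 and vtype != "Flag"):
            suspects.append((kind, _get(body, "ID"), number, vtype))
    return suspects
```

Calling it as `find_suspect_fields(open("my.vcf"))` on an uncompressed file should give the complete list of header entries to look at.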

I found the Picard FixVcfHeader command but, perhaps ironically, it fails on the same errors. Is there any command I could run on my VCFs so that they work with GATK tools such as "CollectAllelicCounts"?

VCF • 2.2k views

This seems to get me around the errors, but it certainly isn't the best option:

sed -e 's/PL,Number=-1/PL,Number=A/' -e 's/Number=0/Number=A/' my.vcf > myClean.vcf
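
One weakness of the sed approach is that it rewrites every Number=0 it finds, including Type=Flag fields, where Number=0 is the expected value. A more conservative variant, sketched in Python below, only touches ##INFO and ##FORMAT header lines and leaves Flag fields alone. The names `fix_header_line` and `fix_vcf` are made up for illustration, and replacing the bad counts with `.` (unknown or variable number of values) is my assumption rather than a verified fix: the true cardinality of each field may differ (PL, for example, is normally Number=G).

```python
import re

HEADER_PREFIXES = ("##INFO=", "##FORMAT=")
# An integer Number entry that directly follows '<' or ',' in a header line.
NUMBER_RE = re.compile(r"(?<=[<,])Number=(-?\d+)(?=[,>])")
TYPE_RE = re.compile(r"(?<=[<,])Type=([^,>]+)")


def fix_header_line(line):
    """Rewrite a suspicious integer count in one INFO/FORMAT header line.

    Negative counts, and zero counts on non-Flag fields, become 'Number=.'.
    Every other line is returned unchanged.
    """
    if not line.startswith(HEADER_PREFIXES):
        return line
    type_match = TYPE_RE.search(line)
    is_flag = type_match is not None and type_match.group(1) == "Flag"

    def replace(match):
        count = int(match.group(1))
        if count < 0 or (count == 0 and not is_flag):
            return "Number=."
        return match.group(0)

    # Only the first Number entry is considered, so text inside the
    # Description that happens to look like a count is not modified.
    return NUMBER_RE.sub(replace, line, count=1)


def fix_vcf(in_path, out_path):
    """Copy an uncompressed VCF, fixing header lines along the way."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(fix_header_line(line))
```

Usage would be `fix_vcf("my.vcf", "myClean.vcf")`. Gzipped input is not handled here, and the output should still be checked with the downstream tool before relying on it.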