Question: how to fix old VCF headers
4 months ago by
Richard550 wrote:

Hi folks,

I'm working with some older VCF files and trying to get them to work with newer Picard and GATK commands. I'm running into a lot of errors like this one:

Unable to parse header with error: Invalid count number, with fixed count the number should be 1 or higher: key=INFO name=NMD.GENE

I've hit these errors for a lot of fields, and I'm going through them one by one trying to fix each. However, the list has grown to the point that I thought there might be a tool out there that can fix all of these errors at once. I've seen the same or a similar error for the following fields in my VCF file, each time complaining about a Number=0 declaration in the VCF header.

  • PL
  • AA_LEN
  • (the list goes on)
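
To see every offending header line up front instead of hitting them one error at a time, a rough sketch like the one below may help. It assumes the error is triggered by INFO/FORMAT header lines that declare Number=0 or a negative count while Type is not Flag (as far as I can tell from the VCF spec, Flag is the only type for which Number=0 is valid). The function name list_bad_counts is made up for illustration.

```shell
# Hypothetical helper: print INFO/FORMAT header lines whose fixed count is
# 0 or negative and whose Type is not Flag.
list_bad_counts() {
    grep -E '^##(INFO|FORMAT)=<' "$1" \
        | grep -E 'Number=(0|-[0-9]+)[,>]' \
        | grep -v 'Type=Flag'
}

# Usage: list_bad_counts my.vcf
```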

I found the Picard FixVcfHeader command but, perhaps ironically, it also fails with the same errors. Are there any commands I could apply to my VCFs so that I can use them with GATK tools such as "CollectAllelicCounts"?

written 4 months ago by Richard550

This seems to get me around the errors, but it certainly isn't the best option:

sed -e '/^##/s/PL,Number=-1/PL,Number=A/' -e '/^##/s/Number=0/Number=A/' my.vcf > myClean.vcf
written 4 months ago by Richard550
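
A somewhat more conservative variant of the workaround above, offered as a sketch rather than a vetted fix. Replacing every Number=0 also rewrites Type=Flag lines, where 0 is the correct value, and Number=A claims one value per alternate allele, which may not be true for these fields. The version below touches only INFO/FORMAT header lines, skips Flag fields, and uses Number=. (unknown or variable count) for everything else. Two assumptions of mine, not from the post above: PL is set to Number=G, which is how the VCF spec defines it, and Number=. is acceptable for the remaining fields because their real counts are not known here. The function name fix_vcf_counts is hypothetical.

```shell
# Hypothetical helper: rewrite invalid fixed counts in the header only.
# Records and all other header lines are printed unchanged.
fix_vcf_counts() {
    awk '
        # Assumption: PL gets Number=G (one value per genotype), per the VCF spec.
        /^##FORMAT=<ID=PL,/ { sub(/Number=(0|-[0-9]+),/, "Number=G,") }
        # Assumption: other non-Flag fields get Number=. (unknown count).
        /^##(INFO|FORMAT)=</ && !/Type=Flag/ { sub(/Number=(0|-[0-9]+),/, "Number=.,") }
        { print }
    ' "$1"
}

# Usage: fix_vcf_counts my.vcf > myClean.vcf
```

If the true count for a field is known (for example 1 for a single gene name), editing that header line by hand to the real value is probably better than leaving it as ".".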
