Question: how to fix old VCF headers
Asked 9 months ago by Richard560 (Canada):

Hi folks,

I'm working with some older VCF files and trying to get them to work with recent Picard and GATK commands. I keep running into errors like this one:

Unable to parse header with error: Invalid count number, with fixed count the number should be 1 or higher: key=INFO name=NMD.GENE

I've seen this kind of error for many fields, and I've been going through them one by one trying to fix each. However, the list has grown to the point that I thought there might be a tool out there that can fix all of these errors at once. The same or a similar error occurs for each of the following fields in my VCF file, in every case complaining about a Number=0 entry in the VCF header.

  • PL
  • AA_LEN
  • NMD.NUMTR
  • EFF.EFFECT
  • SVLEN
  • NMD.GENE
  • NMD.GENEID
  • EFF.EXID
  • EFF.BIOTYPE
  • CNADJ
  • (the list goes on)

I found the Picard FixVcfHeader command but, perhaps ironically, it fails on the same errors. Are there any commands that I could apply to my VCFs so that I can use them with tools such as "CollectAllelicCounts" in GATK?
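
Rather than discovering the affected fields one error at a time, it may help to list every suspicious header line up front. The following is a minimal sketch, assuming the headers look like `##INFO=<ID=...,Number=0,Type=...>`; `list_bad_counts` is a hypothetical helper name, not part of Picard or GATK. It skips Type=Flag lines, because Flag is the one type for which Number=0 is expected.

```shell
# Hypothetical helper: reads a VCF on stdin and prints the INFO/FORMAT
# header lines whose Number is 0 or negative and whose Type is not Flag.
list_bad_counts() {
  awk '/^##(INFO|FORMAT)=</ && /Number=(0|-[0-9]+)[,>]/ && !/Type=Flag/'
}

# Usage (my.vcf is the file name used elsewhere in this thread):
#   list_bad_counts < my.vcf
```

The output is the set of header lines a parser is likely to reject, which gives a complete to-do list before any edits are made.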


This seems to get me around the errors, but it certainly isn't the best option:

sed -e 's/PL,Number=-1/PL,Number=A/' -e 's/Number=0/Number=A/' my.vcf > myClean.vcf
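
A somewhat narrower variant of the same workaround is sketched below. It touches only `##INFO`/`##FORMAT` header lines, leaves Type=Flag definitions alone (they are supposed to carry Number=0), and writes `Number=.`, which the VCF specification uses for an unknown or variable count. That choice is an assumption: the real cardinality of each field is not known from the thread, and for PL specifically the specification defines Number=G. `fix_vcf_counts` is a hypothetical helper name.

```shell
# Hypothetical helper: reads a VCF on stdin, writes the repaired VCF to stdout.
fix_vcf_counts() {
  awk '
    # Only INFO/FORMAT definitions that are not flags are rewritten.
    /^##(INFO|FORMAT)=</ && !/Type=Flag/ {
      # Assumption: the true count is unknown, so "." is used as a placeholder.
      sub(/,Number=(0|-[0-9]+),/, ",Number=.,")
    }
    # Every line, changed or not, is passed through.
    { print }
  '
}

# Usage (file names as in the one-liner above):
#   fix_vcf_counts < my.vcf > myClean.vcf
```

Whether the downstream tools interpret the affected fields sensibly with `Number=.` would still need to be checked per field.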
(comment written 9 months ago by Richard560; modified 9 months ago by finswimmer11k)