Question: how to fix old VCF headers
Richard (Canada) wrote, 4 months ago:

Hi folks,

I'm working with some older VCF files and trying to get them to work with newer Picard and GATK commands. I'm running into a lot of errors like this one:

Unable to parse header with error: Invalid count number, with fixed count the number should be 1 or higher: key=INFO name=NMD.GENE

I've seen these sorts of errors in a lot of fields and I'm going through them one by one trying to fix each. However, the list has grown to the point that I thought there might be a tool out there that can fix all of these errors at once. I've seen the same or a similar error for the following fields in my VCF file, each of them complaining about Number=0 in the VCF header:

  • PL
  • AA_LEN
  • NMD.NUMTR
  • EFF.EFFECT
  • SVLEN
  • NMD.GENE
  • NMD.GENEID
  • EFF.EXID
  • EFF.BIOTYPE
  • CNADJ
  • (the list goes on)

I found the Picard FixVcfHeader command but, perhaps ironically, it fails on the same errors. Are there any commands that I could apply to my VCFs so that I can use them with GATK commands such as "CollectAllelicCounts"?
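Rather than discovering the affected fields one error at a time, the suspect header lines can be listed up front. This is only a rough sketch: the function name list_bad_counts is made up for illustration, and it assumes the header lines follow the usual `##INFO=<ID=...,Number=...,Type=...>` layout. Type=Flag lines are skipped because Number=0 is the expected value there.

```shell
# Hypothetical helper (not part of Picard or GATK): print INFO/FORMAT header
# lines that declare Number=0 or Number=-1, ignoring Type=Flag lines, for
# which Number=0 is the normal value.
list_bad_counts() {
  grep -E '^##(INFO|FORMAT)=<.*,Number=(0|-1),' "$1" | grep -v 'Type=Flag'
}

# Usage (file name taken from the thread):
#   list_bad_counts my.vcf
```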


This seems to get me around the errors, but it certainly isn't the best option:

sed -e 's/PL,Number=-1/PL,Number=A/' -e 's/Number=0/Number=A/' my.vcf > myClean.vcf
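A slightly more conservative variant of the workaround above, offered as a sketch and not as a vetted fix. It touches only `##INFO` and `##FORMAT` header lines, leaves Type=Flag lines alone (Number=0 is valid there), and rewrites the count to `.`, which the VCF spec uses for an unknown or variable number of values, instead of asserting `A`. The function name fix_vcf_header is made up for illustration, and whether `Number=.` is acceptable for every field and every downstream tool is an assumption you would need to check.

```shell
# Hypothetical helper: rewrite legacy Number=-1 / Number=0 declarations in
# INFO and FORMAT header lines to Number=. (unknown count).
# Assumption: "." is an acceptable count for these fields in your pipeline.
fix_vcf_header() {
  sed -E '
    /^##(INFO|FORMAT)=</ {
      s/,Number=-1,/,Number=.,/
      /Type=Flag/!s/,Number=0,/,Number=.,/
    }
  ' "$1"
}

# Usage (file names taken from the thread):
#   fix_vcf_header my.vcf > myClean.vcf
```

Data lines are never modified, because the substitutions only run on lines that start with `##INFO=<` or `##FORMAT=<`.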
Comment written 4 months ago by Richard (modified 4 months ago by finswimmer)