Question: PyVCF error: KeyError: 'ANN' when appending annotation fields to a list
2.5 years ago, spiral01 wrote:

I have downloaded the 1000 genomes phase 3 vcf file for chromosome 1 and annotated it using snpEff. I am now trying to parse the annotated file to create a new text file with only the data I need. My issue is when I try to parse the annotation field. My code for this bit is as below:

    import vcf

    tempList = []
    vcf_reader = vcf.Reader(open('/ann.chr1.vcf', 'r'))
    for record in vcf_reader:
        annList = [i.split('|') for i in record.INFO['ANN']]

This runs but I get the error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
KeyError: 'ANN'

I have tried running this same code on a much smaller file (I literally just took a few lines from the vcf file as well as the metadata to create a small file to test on) and it runs fine, but when I run it on the full vcf file I am getting this error. I have even tried appending the whole 'ANN' field to a list using:

for record in vcf_reader:
    tempList.append(record.INFO['ANN'])

This works fine for all other fields (e.g. record.INFO['CHROM']), but I get the same error when it comes to the 'ANN' field. The code does run for a while, and I have checked the length of the list, but it is different every time I run the code, which suggests it is stopping at a different point each time. As such, I am really not sure what is going on here. Thanks.

snp pyvcf python

Try debugging with a try-except statement to figure out on which lines it goes wrong, then look at those lines:

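for record in vcf_reader:
    try:
        annList = [i.split('|') for i in record.INFO['ANN']]
    except KeyError:
        # Print the offending record so you can find it in the VCF file.
        print(record)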
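If you would rather collect the problem positions than print them one by one, something along these lines might help. It is only a sketch: `split_ann` is a hypothetical helper name (not part of PyVCF), and it assumes each record exposes `.INFO` as a dict plus `.CHROM` and `.POS`, as PyVCF records do. The stand-in records below are made up so the example runs without a VCF file.

```python
from types import SimpleNamespace


def split_ann(records):
    """Split each record's ANN entries on '|' and note records lacking ANN.

    `records` is any iterable of objects with `.INFO` (dict), `.CHROM` and
    `.POS`. This is a hypothetical helper, not a PyVCF function.
    """
    parsed, missing = [], []
    for record in records:
        try:
            ann = record.INFO['ANN']
        except KeyError:
            # No ANN key in this record's INFO column: remember where.
            missing.append((record.CHROM, record.POS))
            continue
        parsed.append([entry.split('|') for entry in ann])
    return parsed, missing


# Made-up stand-in records (assumption: real ones would come from vcf.Reader).
demo_records = [
    SimpleNamespace(CHROM='1', POS=100,
                    INFO={'ANN': ['A|missense_variant|MODERATE']}),
    SimpleNamespace(CHROM='1', POS=200, INFO={'AC': [3]}),
]

parsed, missing = split_ann(demo_records)
print(parsed)
print(missing)
```

With the real file you would call `split_ann(vcf_reader)` instead, then look up the positions in `missing` in the VCF to see what those INFO columns actually contain.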
written 2.5 years ago by WouterDeCoster