Question: PyVCF error: KeyError: 'ANN' when appending annotation fields to a list
spiral01 wrote:

I have downloaded the 1000 Genomes phase 3 VCF file for chromosome 1 and annotated it using snpEff. I am now trying to parse the annotated file to create a new text file with only the data I need. My issue arises when I try to parse the annotation field. The relevant code is below:

   import vcf  # PyVCF

   tempList = []
   vcf_reader = vcf.Reader(open('/ann.chr1.vcf', 'r'))
   for record in vcf_reader:
       # Split each snpEff annotation entry into its pipe-delimited fields
       annList = [i.split('|') for i in record.INFO['ANN']]

This runs but I get the error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
KeyError: 'ANN'

I have tried running this same code on a much smaller file (I literally just took a few lines from the vcf file as well as the metadata to create a small file to test on) and it runs fine, but when I run it on the full vcf file I am getting this error. I have even tried appending the whole 'ANN' field to a list using:

annList = []
for record in vcf_reader:
    annList.append(record.INFO['ANN'])

This works fine for other fields (e.g. record.CHROM), but I get the same error when it comes to the 'ANN' field. The code does run for a while, and I have checked the length of the list, but it is different every time I run the code, which suggests it is stopping at a different point each time. As such, I really am not sure what is going on here. Thanks.

snp pyvcf python • 162 views
modified 8 weeks ago • written 8 weeks ago by spiral01

Try debugging with a try-except statement to figure out on which lines it goes wrong, then look at those lines:

for record in vcf_reader:
    try:
        annList.append(record.INFO['ANN'])
    except KeyError:
        print(record)
written 8 weeks ago by WouterDeCoster
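
If the try-except check shows that some records simply have no 'ANN' entry in their INFO field (an assumption; the thread does not confirm the cause), one way to carry on is to look the key up with dict.get and set those records aside instead of letting the loop crash. The sketch below is only illustrative: split_annotations is a hypothetical helper, and the records are stand-in objects with made-up values so the example runs without PyVCF or the original file. It assumes record.INFO behaves like a plain dict.

```python
from types import SimpleNamespace


def split_annotations(records):
    """Split ANN entries on '|' and collect the records that have no ANN key."""
    parsed, missing = [], []
    for record in records:
        ann = record.INFO.get('ANN')  # None when the record has no ANN entry
        if ann is None:
            missing.append(record)
            continue
        parsed.append([entry.split('|') for entry in ann])
    return parsed, missing


# Stand-in records with made-up values; with PyVCF you would pass the reader instead.
records = [
    SimpleNamespace(CHROM='1', POS=100,
                    INFO={'ANN': ['A|missense_variant|MODERATE',
                                  'A|intron_variant|MODIFIER']}),
    SimpleNamespace(CHROM='1', POS=200, INFO={'AF': [0.5]}),  # no ANN key
]

parsed, missing = split_annotations(records)
print(len(parsed), "records with ANN;", len(missing), "without")
for record in missing:
    # Positions to inspect in the VCF file itself
    print(record.CHROM, record.POS)
```

Printing the chromosome and position of each skipped record makes it easy to look up those lines in the VCF file and see why the annotation is absent.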