Question: PyVCF error: KeyError: 'ANN' when appending annotation fields to a list
9 months ago, spiral01 wrote:

I have downloaded the 1000 genomes phase 3 vcf file for chromosome 1 and annotated it using snpEff. I am now trying to parse the annotated file to create a new text file with only the data I need. My issue is when I try to parse the annotation field. My code for this bit is as below:

   import vcf

   tempList = []
   vcf_reader = vcf.Reader(open('/ann.chr1.vcf', 'r'))
   for record in vcf_reader:
       annList = [i.split('|') for i in record.INFO['ANN']]

This runs but I get the error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
KeyError: 'ANN'

I have tried running this same code on a much smaller file (I literally just took a few lines from the vcf file as well as the metadata to create a small file to test on) and it runs fine, but when I run it on the full vcf file I am getting this error. I have even tried appending the whole 'ANN' field to a list using:

for record in vcf_reader:
    tempList.append(record.INFO['ANN'])

This works fine for all other fields (e.g. record.INFO['CHROM']) but I get the same error when it comes to the 'ANN' field. The code does run for a bit and I have checked the length of the list, but it is different every time I run this code, indicating it is stopping at different points each time. As such, I really am not sure what is going on here. Thanks.

snp pyvcf python • 324 views

Try debugging with a try-except statement to figure out on which lines it goes wrong, then look at those lines:

for record in vcf_reader:
    try:
        annList = [i.split('|') for i in record.INFO['ANN']]
    except KeyError:
        print(record)
written 9 months ago by WouterDeCoster
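
Building on the try-except suggestion, here is a minimal sketch of one way to keep parsing while recording which records have no 'ANN' key. It assumes record.INFO behaves like a plain dict (which the KeyError suggests) and that each 'ANN' value is a list of '|'-delimited strings, as in the question. The helper name split_annotations and the stand-in demo records are hypothetical and only there so the snippet runs without a VCF file; with PyVCF you would pass vcf_reader instead.

```python
from types import SimpleNamespace


def split_annotations(records, key="ANN"):
    """Split each record's annotation entries on '|'.

    Returns (parsed, missing): parsed holds one list of split entries per
    annotated record; missing holds (CHROM, POS) for records without the key.
    """
    parsed = []
    missing = []
    for record in records:
        entries = record.INFO.get(key)  # None instead of KeyError when absent
        if entries is None:
            missing.append((record.CHROM, record.POS))
            continue
        parsed.append([entry.split("|") for entry in entries])
    return parsed, missing


# Stand-in records for illustration only (not real 1000 Genomes data).
demo_records = [
    SimpleNamespace(
        CHROM="1",
        POS=100,
        INFO={"ANN": ["A|missense_variant|MODERATE", "A|intron_variant|MODIFIER"]},
    ),
    SimpleNamespace(CHROM="1", POS=200, INFO={"AC": [1]}),  # no ANN key
]

parsed, missing = split_annotations(demo_records)
print(len(parsed), "record(s) parsed; ANN missing at:", missing)
```

Inspecting the positions collected in missing in the original VCF should show whether those lines really lack an ANN entry or whether something else differs about them.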