Question: .rnt and .ptt file from genbank
gravatar for Molinia
3 months ago by
Molinia0 wrote:

Hello everyone ! I have a problem with my script to convert genbank file to rnt and ptt file I get inspired by an other post Question : Problem combining ptt, rnt files and I've made some adjustements.

The problem is that when I launch the script for example on NC_008253 it writes that there is no length, no genes and so on.... whereas when I'm printing the value for example the gene 'f.qualifiers['gene'][0]' some are printed and then a KeyError is raised... That's why I tried something on my script to bypass the error

Here it is :

 import re
 from Bio import SeqIO

 annotation_file = ""  # .gbk file
 rnt_file = "NC_004431.rnt"  # .rnt file
 ptt_file = "NC_004431.ptt"  # .ptt file
 r = SeqIO.parse(annotation_file, "gb")
 fasta_file = "NC_004431.fna"

 records = []
 with open(annotation_file) as inpf:
    for rec in SeqIO.parse(inpf, 'genbank'):

 SeqIO.write(records, fasta_file, 'fasta')

 for record in r:
     fasta_file = open(fasta_file, "a")
     SeqIO.write(record, fasta_file, "fasta")

     record.features = [f for f in record.features if f.type == "rRNA" or f.type == "tRNA"]
     fout = open(rnt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} RNAs\n".format(len(record.features)))
     strand = {1: '+', -1: '-'}
     for f in record.features:
             fout.write( "{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end),strrand[f.strand], str(abs(f.location.start - f.location.end)), str(record.annotations['gi']), f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))

          except KeyError:
             fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand[f.strand], str(abs(f.location.start - f.location.end)), '-', '-', f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))


 r = SeqIO.parse(annotation_file, "gb")
 for record in r:
     record.features = [f for f in record.features if f.type == "CDS"]
     fout = open(ptt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} proteins\n".format(len(record.features)))
     for f in record.features:
         strand = str(f.strand)
             product = f.qualifiers["product"][0]
             PID = re.sub('[GI:]', "", str(f.qualifiers['db_xref'][0]))
             translation = f.qualifiers['translation'][0]
             for element in translation:
                 fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand, str(len(element)), PID, f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], "-", "-", product])))

           except KeyError:
                PID = '-'
                gene = '-'
                product = '-'
                fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand,"-", PID, gene, f.qualifiers["locus_tag"][0], "-", "-", product])))


I've made an error somewhere but I can't find

Thanks in advance !!

genbank ptt python rnt • 169 views
ADD COMMENTlink modified 3 months ago • written 3 months ago by Molinia0

This may not be what you are looking for since it doesn't address the problem with your script, but there are BioPython and BioPerl solutions available.

ADD REPLYlink written 3 months ago by Mensur Dlakic5.8k

Thanks for the answer but I have already checked this script

ADD REPLYlink written 3 months ago by Molinia0

The problem is that it only write the 'except KeyERROR' part

ADD REPLYlink modified 3 months ago • written 3 months ago by Molinia0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1443 users visited in the last hour