.rnt and .ptt file from genbank
0
0
Entering edit mode
4.1 years ago
Molinia ▴ 10

Hello everyone ! I have a problem with my script to convert genbank file to rnt and ptt file I get inspired by an other post Question : Problem combining ptt, rnt files and I've made some adjustements.

The problem is that when I launch the script for example on NC_008253 it writes that there is no length, no genes and so on.... whereas when I'm printing the value for example the gene 'f.qualifiers['gene'][0]' some are printed and then a KeyError is raised... That's why I tried something on my script to bypass the error

Here it is :

 #!/usr/bin/python3
 import re
 from Bio import SeqIO

 annotation_file = "NC_004431.gb"  # .gbk file
 rnt_file = "NC_004431.rnt"  # .rnt file
 ptt_file = "NC_004431.ptt"  # .ptt file
 r = SeqIO.parse(annotation_file, "gb")
 fasta_file = "NC_004431.fna"

 records = []
 with open(annotation_file) as inpf:
    for rec in SeqIO.parse(inpf, 'genbank'):
       records.append(rec)

 SeqIO.write(records, fasta_file, 'fasta')


 # RNT FILE DEF
 for record in r:
     fasta_file = open(fasta_file, "a")
     SeqIO.write(record, fasta_file, "fasta")
     fasta_file.close()

     record.features = [f for f in record.features if f.type == "rRNA" or f.type == "tRNA"]
     fout = open(rnt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} RNAs\n".format(len(record.features)))
     fout.write("Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n")
     strand = {1: '+', -1: '-'}
     for f in record.features:
         try:
             #print(f.qualifiers['gene'])
             fout.write( "{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end),strrand[f.strand], str(abs(f.location.start - f.location.end)), str(record.annotations['gi']), f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))

          except KeyError:
             fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand[f.strand], str(abs(f.location.start - f.location.end)), '-', '-', f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))

     fout.close()

 # PTT FILE
 r = SeqIO.parse(annotation_file, "gb")
 for record in r:
     record.features = [f for f in record.features if f.type == "CDS"]
     fout = open(ptt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} proteins\n".format(len(record.features)))
     fout.write("Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n")
     for f in record.features:
         strand = str(f.strand)
         try:
             product = f.qualifiers["product"][0]
             PID = re.sub('[GI:]', "", str(f.qualifiers['db_xref'][0]))
             translation = f.qualifiers['translation'][0]
             for element in translation:
                 fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand, str(len(element)), PID, f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], "-", "-", product])))

           except KeyError:
                PID = '-'
                gene = '-'
                product = '-'
                fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand,"-", PID, gene, f.qualifiers["locus_tag"][0], "-", "-", product])))

     fout.close()

I've made an error somewhere but I can't find

Thanks in advance !!

python genbank rnt ptt • 1.4k views
ADD COMMENT
1
Entering edit mode

This may not be what you are looking for since it doesn't address the problem with your script, but there are BioPython and BioPerl solutions available.

https://github.com/HenryWiersma/gbkToPttAndRnt

https://github.com/sgivan/gb2ptt

https://github.com/ajvilleg/gbk2ptt

ADD REPLY
0
Entering edit mode

Thanks for the answer but I have already checked this script

ADD REPLY
0
Entering edit mode

The problem is that it only write the 'except KeyERROR' part

ADD REPLY

Login before adding your answer.

Traffic: 2110 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6