Question: .rnt and .ptt file from genbank
0
gravatar for Molinia
3 months ago by
Molinia0
Molinia0 wrote:

Hello everyone ! I have a problem with my script to convert genbank file to rnt and ptt file I get inspired by an other post Question : Problem combining ptt, rnt files and I've made some adjustements.

The problem is that when I launch the script for example on NC_008253 it writes that there is no length, no genes and so on.... whereas when I'm printing the value for example the gene 'f.qualifiers['gene'][0]' some are printed and then a KeyError is raised... That's why I tried something on my script to bypass the error

Here it is :

 #!/usr/bin/python3
 import re
 from Bio import SeqIO

 annotation_file = "NC_004431.gb"  # .gbk file
 rnt_file = "NC_004431.rnt"  # .rnt file
 ptt_file = "NC_004431.ptt"  # .ptt file
 r = SeqIO.parse(annotation_file, "gb")
 fasta_file = "NC_004431.fna"

 records = []
 with open(annotation_file) as inpf:
    for rec in SeqIO.parse(inpf, 'genbank'):
       records.append(rec)

 SeqIO.write(records, fasta_file, 'fasta')


 # RNT FILE DEF
 for record in r:
     fasta_file = open(fasta_file, "a")
     SeqIO.write(record, fasta_file, "fasta")
     fasta_file.close()

     record.features = [f for f in record.features if f.type == "rRNA" or f.type == "tRNA"]
     fout = open(rnt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} RNAs\n".format(len(record.features)))
     fout.write("Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n")
     strand = {1: '+', -1: '-'}
     for f in record.features:
         try:
             #print(f.qualifiers['gene'])
             fout.write( "{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end),strrand[f.strand], str(abs(f.location.start - f.location.end)), str(record.annotations['gi']), f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))

          except KeyError:
             fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand[f.strand], str(abs(f.location.start - f.location.end)), '-', '-', f.qualifiers["locus_tag"][0], '-', '-', f.qualifiers["product"][0]])))

     fout.close()

 # PTT FILE
 r = SeqIO.parse(annotation_file, "gb")
 for record in r:
     record.features = [f for f in record.features if f.type == "CDS"]
     fout = open(ptt_file, "a")
     fout.write("{0} - 1..{1}\n".format(record.description, len(record)))
     fout.write("{0} proteins\n".format(len(record.features)))
     fout.write("Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n")
     for f in record.features:
         strand = str(f.strand)
         try:
             product = f.qualifiers["product"][0]
             PID = re.sub('[GI:]', "", str(f.qualifiers['db_xref'][0]))
             translation = f.qualifiers['translation'][0]
             for element in translation:
                 fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand, str(len(element)), PID, f.qualifiers['gene'][0], f.qualifiers["locus_tag"][0], "-", "-", product])))

           except KeyError:
                PID = '-'
                gene = '-'
                product = '-'
                fout.write("{0}\n".format("\t".join([str(abs(f.location.start + 1)) + ".." + str(f.location.end), strand,"-", PID, gene, f.qualifiers["locus_tag"][0], "-", "-", product])))

     fout.close()

I've made an error somewhere but I can't find

Thanks in advance !!

genbank ptt python rnt • 169 views
ADD COMMENTlink modified 3 months ago • written 3 months ago by Molinia0
1

This may not be what you are looking for since it doesn't address the problem with your script, but there are BioPython and BioPerl solutions available.

https://github.com/HenryWiersma/gbkToPttAndRnt

https://github.com/sgivan/gb2ptt

https://github.com/ajvilleg/gbk2ptt

ADD REPLYlink written 3 months ago by Mensur Dlakic5.8k

Thanks for the answer but I have already checked this script

ADD REPLYlink written 3 months ago by Molinia0

The problem is that it only write the 'except KeyERROR' part

ADD REPLYlink modified 3 months ago • written 3 months ago by Molinia0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1443 users visited in the last hour