Error with frame/translation in BioPython?
4.8 years ago

I want to write a program that translates DNA sequences into amino acid sequences, so I used Biopython. Here's what I have so far:

import sys
from Bio import SeqIO
import warnings
warnings.filterwarnings("ignore")
filename = raw_input("Enter filename : ")
myfile = open(filename)
fasta_in=filename

for record in SeqIO.parse(fasta_in,'fasta'):
     id_part=record.id
     desc_part=record.description
     seq = record.seq

     print('seq:',seq)

print(seq);
frame_num = raw_input("Enter frame number: ")
if (frame_num[0] == '+'):
    for codon in [seq[i:i+3] for i in range(int(frame_num[1])-1, len(seq), 3)]:
        print seq.translate()
        sys.exit()
else:
    for codon in [seq[i-3:i] for i in range(len(seq)-int(frame_num[1])+1, 0, -3)]:
        print seq.translate()
        sys.exit

I ran this program, but it printed the same amino acid translation for frame +1 and frame -2. For example, with +1:

Enter filename : ringo.txt
('seq:', Seq('TTCAGGAGTGGAACGCACGCCAGCGACGTCCAAGAAGCCTTGAAACAGTTCGTC...TTT', SingleLetterAlphabet()))
('seq:', Seq('ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATA...TAA', SingleLetterAlphabet()))ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATAAGAAACAATGGATATTACTGCTACAGGGACCCAAGGACGGGTAAAGAGTTTGGATTAGGCAGAGACAGGCGAATCGCAATCACTGAAGCTATACAGGCCAACATTGAGTTATTTTCAGGACACAAACACAAGCCTCTGACAGCGAGAATCAACAGTGATAATTCCGTTACGTTACATTCATGGCTTGATCGCTACGAAAAAATCCTGGCCAGCAGAGGAATCAAGCAGAAGACACTCATAAATTACATGAGCAAAATTAAAGCAATAAGGAGGGGTCTGCCTGATGCTCCACTTGAAGACATCACCACAAAAGAAATTGCGGCAATGCTCAATGGATACATAGACGAGGGCAAGGCGGCGTCAGCCAAGTTAATCAGATCAACACTGAGCGATGCATTCCGAGAGGCAATAGCTGAAGGCCATATAACAACAAACCATGTCGCTGCCACTCGCGCAGCAAAATCAGAGGTAAGGAGATCAAGACTTACGGCTGACGAATACCTGAAAATTTATCAAGCAGCAGAATCATCACCATGTTGGCTCAGACTTGCAATGGAACTGGCTGTTGTTACCGGGCAACGAGTTGGTGATTTATGCGAAATGAAGTGGTCTGATATCGTAGATGGATATCTTTATGTCGAGCAAAGCAAAACAGGCGTAAAAATTGCCATCCCAACAGCATTGCATATTGATGCTCTCGGAATATCAATGAAGGAAACACTTGATAAATGCAAAGAGATTCTTGGCGGAGAAACCATAATTGCATCTACTCGTCGCGAACCGCTTTCATCCGGCACAGTATCAAGGTATTTTATGCGCGCACGAAAAGCATCAGGTCTTTCCTTCGAAGGGGATCCGCCTACCTTTCACGAGTTGCGCAGTTTGTCTGCAAGACTCTATGAGAAGCAGATAAGCGATAAGTTTGCTCAACATCTTCTCGGGCATAAGTCGGACACCATGGCATCACAGTATCGTGATGACAGAGGCAGGGAGTGGGACAAAATTGAAATCAAATAA
Enter frame number: +1


MGRRRSHERRDLPPNLYIRNNGYYCYRDPRTGKEFGLGRDRRIAITEAIQANIELFSGHKHKPLTARINSDNSVTLHSWLDRYEKILASRGIKQKTLINYMSKIKAIRRGLPDAPLEDITTKEIAAMLNGYIDEGKAASAKLIRSTLSDAFREAIAEGHITTNHVAATRAAKSEVRRSRLTADEYLKIYQAAESSPCWLRLAMELAVVTGQRVGDLCEMKWSDIVDGYLYVEQSKTGVKIAIPTALHIDALGISMKETLDKCKEILGGETIIASTRREPLSSGTVSRYFMRARKASGLSFEGDPPTFHELRSLSARLYEKQISDKFAQHLLGHKSDTMASQYRDDRGREWDKIEIK*

And with -2:

Enter filename : ringo.txt
('seq:', Seq('TTCAGGAGTGGAACGCACGCCAGCGACGTCCAAGAAGCCTTGAAACAGTTCGTC...TTT', SingleLetterAlphabet()))
('seq:', Seq('ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATA...TAA', SingleLetterAlphabet()))ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATAAGAAACAATGGATATTACTGCTACAGGGACCCAAGGACGGGTAAAGAGTTTGGATTAGGCAGAGACAGGCGAATCGCAATCACTGAAGCTATACAGGCCAACATTGAGTTATTTTCAGGACACAAACACAAGCCTCTGACAGCGAGAATCAACAGTGATAATTCCGTTACGTTACATTCATGGCTTGATCGCTACGAAAAAATCCTGGCCAGCAGAGGAATCAAGCAGAAGACACTCATAAATTACATGAGCAAAATTAAAGCAATAAGGAGGGGTCTGCCTGATGCTCCACTTGAAGACATCACCACAAAAGAAATTGCGGCAATGCTCAATGGATACATAGACGAGGGCAAGGCGGCGTCAGCCAAGTTAATCAGATCAACACTGAGCGATGCATTCCGAGAGGCAATAGCTGAAGGCCATATAACAACAAACCATGTCGCTGCCACTCGCGCAGCAAAATCAGAGGTAAGGAGATCAAGACTTACGGCTGACGAATACCTGAAAATTTATCAAGCAGCAGAATCATCACCATGTTGGCTCAGACTTGCAATGGAACTGGCTGTTGTTACCGGGCAACGAGTTGGTGATTTATGCGAAATGAAGTGGTCTGATATCGTAGATGGATATCTTTATGTCGAGCAAAGCAAAACAGGCGTAAAAATTGCCATCCCAACAGCATTGCATATTGATGCTCTCGGAATATCAATGAAGGAAACACTTGATAAATGCAAAGAGATTCTTGGCGGAGAAACCATAATTGCATCTACTCGTCGCGAACCGCTTTCATCCGGCACAGTATCAAGGTATTTTATGCGCGCACGAAAAGCATCAGGTCTTTCCTTCGAAGGGGATCCGCCTACCTTTCACGAGTTGCGCAGTTTGTCTGCAAGACTCTATGAGAAGCAGATAAGCGATAAGTTTGCTCAACATCTTCTCGGGCATAAGTCGGACACCATGGCATCACAGTATCGTGATGACAGAGGCAGGGAGTGGGACAAAATTGAAATCAAATAA
Enter frame number: -2
MGRRRSHERRDLPPNLYIRNNGYYCYRDPRTGKEFGLGRDRRIAITEAIQANIELFSGHKHKPLTARINSDNSVTLHSWLDRYEKILASRGIKQKTLINYMSKIKAIRRGLPDAPLEDITTKEIAAMLNGYIDEGKAASAKLIRSTLSDAFREAIAEGHITTNHVAATRAAKSEVRRSRLTADEYLKIYQAAESSPCWLRLAMELAVVTGQRVGDLCEMKWSDIVDGYLYVEQSKTGVKIAIPTALHIDALGISMKETLDKCKEILGGETIIASTRREPLSSGTVSRYFMRARKASGLSFEGDPPTFHELRSLSARLYEKQISDKFAQHLLGHKSDTMASQYRDDRGREWDKIEIK*

I want the program to work like this:

Ask the user for the name of a file in FASTA format. Open the file (while checking for errors). Read the frame number (+1, +2, +3, -1, -2, -3) from the user. Output the amino acid sequence read in the chosen frame.

I don't know what to fix. If you know a good solution, please let me know.

python biopython • 1.2k views

Hi, is this a homework/assignment question?


Just FYI, OP, I’ve edited your tags (the # are not necessary) to make it easier for people watching the tags to find your post, and I’ve altered your title to be a bit more specific to the question. If you would like to rephrase my change, please feel free, but try to keep the title descriptive of the thread content.


BioPython is smart enough that you don’t need to get your hands quite so dirty with the codons and frames.

There is a whole module of sequence utilities, which includes a six-frame translator, so you can translate everything at once:

https://biopython.org/DIST/docs/api/Bio.SeqUtils-module.html#six_frame_translations
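
As a rough illustration of that utility, here is a minimal sketch that calls `six_frame_translations` from `Bio.SeqUtils` on a plain string. The test sequence is just the first 54 bases shown in the question's output, and Python 3 is assumed (the question's code is Python 2). The function returns one formatted text report rather than separate sequence objects.

```python
from Bio.SeqUtils import six_frame_translations

# First 54 bases of the second record shown in the question's output.
dna = "ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATA"

# The return value is a single multi-line string showing the sequence
# together with its translations in all six frames.
report = six_frame_translations(dna)
print(report)
```

Because the result is text laid out for reading, picking out one frame from it means parsing that string.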


While I like that util, I think its main purpose is to prettify things. I find it difficult to work with the string representation of the translations.


True, though OP could probably extract the relevant bits of the source for their problem. Looping over each triplet individually to translate them doesn’t strike me as efficient, robust, or necessary.
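
For context on the symptom in the question: the posted loop calls `seq.translate()` on the whole sequence inside the loop body and never uses `codon`, and the `+` branch exits after the first iteration, so the chosen frame cannot affect what is printed. One possible way to translate a single frame without looping over triplets is sketched below. The helper name `translate_frame` is hypothetical, Python 3 is assumed, and the numbering of negative frames (frame -N starts N-1 bases into the reverse complement) is one common convention, not the only one.

```python
from Bio.Seq import Seq

VALID_FRAMES = (1, 2, 3, -1, -2, -3)


def translate_frame(seq, frame):
    """Translate a Bio.Seq.Seq in one of the six reading frames.

    Hypothetical helper. Assumes frame -N means offset N-1 on the
    reverse-complement strand.
    """
    if frame not in VALID_FRAMES:
        raise ValueError("frame must be one of +1, +2, +3, -1, -2, -3")
    # Negative frames are read from the reverse-complement strand.
    strand = seq if frame > 0 else seq.reverse_complement()
    # Skip the 0, 1 or 2 bases that precede the start of the frame.
    start = abs(frame) - 1
    # Stop at the last complete codon so no partial codon is translated.
    end = start + 3 * ((len(strand) - start) // 3)
    return strand[start:end].translate()


# First 54 bases of the second record shown in the question's output.
dna = Seq("ATGGGAAGAAGGCGAAGTCATGAGCGCCGGGATTTACCCCCTAACCTTTATATA")

for text in ("+1", "-2"):
    # int() accepts a leading sign, so "+1" -> 1 and "-2" -> -2.
    print(text, translate_frame(dna, int(text)))
```

In the full program, `seq` would come from each record yielded by `SeqIO.parse`, and the frame string from user input. The question's version keeps only the last record after the loop finishes, so the translation step probably belongs inside that loop.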
