Question: Convert a FASTA Amino Acid Sequence to RNA (Reverse Central Dogma)
9 months ago by incensefrenzie2006 (0) wrote:

I see lots of examples of how to convert DNA to RNA or RNA to amino acids, and plenty of examples using Python and Biopython. How could I do the reverse: amino acid sequence to RNA, and then RNA to DNA?

Define a dict that maps each amino acid to its corresponding codons:

AA_codon = {
    'C': ['TGT', 'TGC'],
    'D': ['GAT', 'GAC'],
    'S': ['TCT', 'TCG', 'TCA', 'TCC', 'AGC', 'AGT'],
    'Q': ['CAA', 'CAG'],
    'M': ['ATG'],  # Start
    'N': ['AAC', 'AAT'],
    'P': ['CCT', 'CCG', 'CCA', 'CCC'],
    'K': ['AAG', 'AAA'],
    '*': ['TAG', 'TGA', 'TAA'],  # Stop
    'T': ['ACC', 'ACA', 'ACG', 'ACT'],
    'F': ['TTT', 'TTC'],
    'A': ['GCA', 'GCC', 'GCG', 'GCT'],
    'G': ['GGT', 'GGG', 'GGA', 'GGC'],
    'I': ['ATC', 'ATA', 'ATT'],
    'L': ['TTA', 'TTG', 'CTC', 'CTT', 'CTG', 'CTA'],
    'H': ['CAT', 'CAC'],
    'R': ['CGA', 'CGC', 'CGG', 'CGT', 'AGG', 'AGA'],
    'W': ['TGG'],
    'V': ['GTA', 'GTC', 'GTG', 'GTT'],
    'E': ['GAG', 'GAA'],
    'Y': ['TAT', 'TAC'],
}

ReverseTranslate(): iterate over each character in the protein string, look up its codons, and join the results.
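A minimal sketch of what such a function could look like. It assumes the standard genetic code and simply takes the first codon listed for each amino acid, so it returns one of many valid answers, not the original sequence. The names `build_aa_codon`, `reverse_translate`, and `dna_to_rna` are made up for illustration; the table is rebuilt from the standard code here only so the snippet runs on its own.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_aa_codon():
    """Map each amino acid (and '*' for stop) to the list of DNA codons encoding it."""
    table = {}
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
        table.setdefault(aa, []).append("".join(triplet))
    return table


AA_CODON = build_aa_codon()


def reverse_translate(protein):
    """Return ONE possible DNA coding sequence for the protein.

    Assumption: the first codon in each list is used; this choice is arbitrary.
    """
    return "".join(AA_CODON[aa][0] for aa in protein.upper())


def dna_to_rna(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.replace("T", "U")


dna = reverse_translate("MW")
print(dna, dna_to_rna(dna))
```

Because M and W each have a single codon, "MW" is one of the few inputs where the answer is unambiguous; for anything else the output depends entirely on which codon you decide to pick.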


Basically, it's because there are so many possible combinations, and so much redundancy, that the number of possible nucleotide sequences that could give rise to a particular amino acid sequence is too big to be useful for anything downstream. It would also be difficult to even represent the data in a useful way.

written 9 months ago by jrj.healey (4.6k)

You can choose a codon frequency table and do a rough codon optimization. I say rough because you won't be taking into account PTMs (post-translational modifications) or other risks.
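As a rough sketch of that idea: pick the most frequent codon for each residue from a usage table. The numbers in `CODON_USAGE` below are made up purely for illustration and cover only four amino acids; a real run would need a published usage table for the target organism. The function name `reverse_translate_by_usage` is also hypothetical.

```python
# HYPOTHETICAL relative codon frequencies, invented for this example only.
# Replace with a real codon usage table for your organism of interest.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "F": {"TTT": 0.4, "TTC": 0.6},
    "W": {"TGG": 1.0},
}


def reverse_translate_by_usage(protein, usage=CODON_USAGE):
    """Back-translate a protein by taking the most frequent codon per residue."""
    codons = []
    for aa in protein.upper():
        if aa not in usage:
            raise ValueError(f"no codon usage data for residue {aa!r}")
        # max() over the codons, ranked by their frequency value
        codons.append(max(usage[aa], key=usage[aa].get))
    return "".join(codons)


print(reverse_translate_by_usage("MKFW"))
```

Always taking the top codon is the simplest possible strategy; it ignores things like repeats, GC content, and secondary structure.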

modified 9 months ago • written 9 months ago by

I was under the impression that Biopython had method(s) to handle this kind of problem, but I was not considering the "redundancy" of the codon table.

written 9 months ago by incensefrenzie2006 (0)
9 months ago by Rob (2.3k), United States, wrote:

The key point to note here is that a given codon implies a single amino acid, but a particular amino acid can result from multiple different codons, so the reverse operation does not have a unique solution. Consider an amino acid sequence a_1, a_2, ..., a_n, where c_1, c_2, ..., c_n are the numbers of possible codons for each amino acid in turn. Then there are \prod_{i=1}^{n} c_i different nucleotide sequences that encode the given amino acid sequence. This grows exponentially with n, so enumerating all possibilities won't be tractable for even reasonably long sequences.
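To put numbers on that product, here is a small sketch that counts the encodings for a given peptide under the standard genetic code. The name `count_encodings` is just for illustration, and stop codons are left out.

```python
from math import prod  # Python 3.8+

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein):
    """Return prod(c_i): how many nucleotide sequences encode this protein."""
    return prod(CODON_COUNTS[aa] for aa in protein.upper())


print(count_encodings("MW"))    # 1 * 1
print(count_encodings("MSL"))   # 1 * 6 * 6
print(count_encodings("MSL" * 20))  # a 60-residue peptide
```

Even a three-residue peptide like MSL already has 36 encodings, and each additional residue multiplies the count by its own codon number.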
