Question: Convert a FASTA Amino Acid Sequence to RNA (Reverse Central Dogma)
19 months ago, incensefrenzie2006 wrote:

I see lots of examples of how to convert DNA to RNA, or RNA to amino acids, and plenty of them use Python and Biopython. How could I do the reverse: amino acid sequence to RNA, and then RNA to DNA?

Define a dict that maps each one-letter amino acid code to its corresponding codons

# Standard genetic code, written as DNA codons (replace T with U for RNA)
AA_codon = {
    'C': ['TGT', 'TGC'],
    'D': ['GAT', 'GAC'],
    'S': ['TCT', 'TCG', 'TCA', 'TCC', 'AGC', 'AGT'],
    'Q': ['CAA', 'CAG'],
    'M': ['ATG'],  # Start
    'N': ['AAC', 'AAT'],
    'P': ['CCT', 'CCG', 'CCA', 'CCC'],
    'K': ['AAG', 'AAA'],
    '*': ['TAG', 'TGA', 'TAA'],  # Stop
    'T': ['ACC', 'ACA', 'ACG', 'ACT'],
    'F': ['TTT', 'TTC'],
    'A': ['GCA', 'GCC', 'GCG', 'GCT'],
    'G': ['GGT', 'GGG', 'GGA', 'GGC'],
    'I': ['ATC', 'ATA', 'ATT'],
    'L': ['TTA', 'TTG', 'CTC', 'CTT', 'CTG', 'CTA'],
    'H': ['CAT', 'CAC'],
    'R': ['CGA', 'CGC', 'CGG', 'CGT', 'AGG', 'AGA'],
    'W': ['TGG'],
    'V': ['GTA', 'GTC', 'GTG', 'GTT'],
    'E': ['GAG', 'GAA'],
    'Y': ['TAT', 'TAC'],
}

ReverseTranslate(): Iterate over each character in the amino acid string, look up a codon for it, and join the codons
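One way the ReverseTranslate() idea could be sketched is shown here. This is only an illustration under stated assumptions: the names `AA_TO_CODONS` and `reverse_translate` are made up for this example, the table is rebuilt from the standard genetic code rather than taken from any library, and the function simply picks the first codon listed for each residue, which is an arbitrary choice and not a biologically preferred one.

```python
from itertools import product

# Standard genetic code in the usual compact layout: codons enumerated with
# bases in the order T, C, A, G (third position varying fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Hypothetical helper table: one-letter amino acid -> list of DNA codons.
AA_TO_CODONS = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    AA_TO_CODONS.setdefault(aa, []).append("".join(bases))


def reverse_translate(protein, to_rna=True):
    """Return ONE of the many nucleotide sequences that encode `protein`.

    Assumption: the first codon listed for each amino acid is used, which is
    an arbitrary pick. Unknown letters raise KeyError.
    """
    dna = "".join(AA_TO_CODONS[aa][0] for aa in protein.upper())
    return dna.replace("T", "U") if to_rna else dna


print(reverse_translate("MW"))                 # AUGUGG
print(reverse_translate("MW", to_rna=False))   # ATGTGG
```

Methionine and tryptophan each have a single codon, so the output for "MW" is unambiguous; for almost any other peptide the result is just one candidate among many.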


Basically, there are so many possible combinations, and so much redundancy, that the number of nucleotide sequences that could give rise to a particular amino acid sequence is too big to be useful for anything downstream. It would also be difficult to even represent the data in a useful way.

written 19 months ago by jrj.healey

You can choose a codon frequency table and do a rough codon optimization. I say rough because you won't be taking into account PTMs (post-translational modifications) or other risks.
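A minimal sketch of that idea, assuming the usage table is supplied as a nested dict of amino acid to codon to relative frequency. The function name `pick_preferred_codons` and the `TOY_USAGE` numbers are invented for illustration only; real values would have to come from a codon usage table for the target organism.

```python
def pick_preferred_codons(protein, codon_usage):
    """Choose the most frequent codon for each residue.

    `codon_usage` is assumed to look like {aa: {codon: relative_frequency}}.
    This ignores everything except raw frequency (no PTMs, no secondary
    structure, no restriction sites), so it is only a crude optimization.
    """
    chosen = []
    for aa in protein.upper():
        options = codon_usage[aa]          # KeyError if the residue is missing
        chosen.append(max(options, key=options.get))
    return "".join(chosen)


# Toy numbers, NOT real measurements: placeholders to show the data shape.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "W": {"TGG": 1.0},
}

print(pick_preferred_codons("MKW", TOY_USAGE))  # ATGAAATGG
```

Always taking the top codon is the simplest policy; sampling codons in proportion to their frequency is another option some people prefer.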

written 19 months ago, modified 19 months ago

I was under the impression that Biopython had method(s) to handle this kind of problem, but I was not considering the "redundancy" of the codon table.

written 19 months ago by incensefrenzie2006
19 months ago, Rob (United States) wrote:

The key point you should note here is that a given codon implies a single amino acid, but a particular amino acid could result from multiple different codons, so the reverse operation does not have a unique solution. Consider an amino acid sequence a_1, a_2, ..., a_n, where c_1, c_2, ..., c_n are the numbers of possible codons for each amino acid in this sequence in turn. Then there are \prod_{i=1}^{n} c_i distinct nucleotide sequences that encode the given amino acid sequence. This grows exponentially with n, and enumerating all possibilities won't be tractable for even reasonably long sequences.
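To make the product concrete, here is a small sketch that computes it for a given peptide. The names `CODON_COUNTS` and `count_encodings` are hypothetical, and the counts are derived from the standard genetic code; other translation tables would give different numbers.

```python
from itertools import product
from math import prod

# Standard genetic code in the usual compact layout (bases ordered T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# c(aa): how many codons encode each amino acid ('*' stands for stop).
CODON_COUNTS = {}
for _codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODON_COUNTS[aa] = CODON_COUNTS.get(aa, 0) + 1


def count_encodings(protein):
    """Number of nucleotide sequences encoding `protein`: the product of c_i."""
    return prod(CODON_COUNTS[aa] for aa in protein.upper())


print(count_encodings("MW"))         # 1   (both residues have a single codon)
print(count_encodings("LSR"))        # 216 (6 * 6 * 6)
print(count_encodings("L" * 100))    # 6**100, a 78-digit number
```

Counting is cheap because it never builds the sequences themselves; it is the enumeration that becomes infeasible.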
