Question: Convert a FASTA Amino Acid Sequence to RNA (Reverse Central Dogma)
gravatar for incensefrenzie2006
2.8 years ago by
incensefrenzie20060 wrote:

I see lots of example on how to convert DNA to RNA or RNA to Amino. I see plenty examples using Python and Biopython. How could I do the reverse? Amino acid sequence to RNA and then RNA to DNA.

Define a dict that maps Amino acids to the corresponding codon

AA_codon = {
'C': ['TGT', 'TGC'], 
'A': ['GAT', 'GAC'], 
'S': ['TCT', 'TCG', 'TCA', 'TCC', 'AGC', 'AGT'], 
'G': ['CAA', 'CAG'], 
 'M': ['ATG'], #Start
 'A': ['AAC', 'AAT'], 
 'P': ['CCT', 'CCG', 'CCA', 'CCC'], 
 'L': ['AAG', 'AAA'], 
 'Q': ['TAG', 'TGA', 'TAA'], #Stop
 'T': ['ACC', 'ACA', 'ACG', 'ACT'], 
 'P': ['TTT', 'TTC'], 
 'A': ['GCA', 'GCC', 'GCG', 'GCT'], 
 'G': ['GGT', 'GGG', 'GGA', 'GGC'], 
 'I': ['ATC', 'ATA', 'ATT'], 
 'L': ['TTA', 'TTG', 'CTC', 'CTT', 'CTG', 'CTA'], 
 'H': ['CAT', 'CAC'], 
 'A': ['CGA', 'CGC', 'CGG', 'CGT', 'AGG', 'AGA'], 
 'T': ['TGG'], 
 'V': ['GTA', 'GTC', 'GTG', 'GTT'], 
 'G': ['GAG', 'GAA'], 
 'T': ['TAT', 'TAC'] }

ReverseTranslate(): Read over each character in string & join

sequence python • 1.6k views
ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by incensefrenzie20060

Basically, it's because there are so many possible combinations, and so much redundancy, that the number of possible sequences that give rise to a particular amino acid, is too big to be useful for anything downstream. It would also be difficut to even represent the data in a useful way.

ADD REPLYlink written 2.8 years ago by Joe17k

You can choose a frequency table, and sort of do a codon optimization. I say sort of, because you won't be taking into account PTMs, or other risks.

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by

I was under the impression that Biopython had method(s) to work this kind of problem, but I was not considering the "redundancy" of the codon table.

ADD REPLYlink written 2.8 years ago by incensefrenzie20060
gravatar for Rob
2.8 years ago by
United States
Rob4.0k wrote:

The key point you should note here is that a given codon implies a single amino acid, but a particular amino acid could result from multiple different codons. The reverse operation does not have a unique solution. Consider an amino acid sequence a_1, a_2, ..., a_n, where c_1, c_2, ..., c_n are the number of possible codons for each amino acid in this sequence in turn --- then there are \prod_{i=1}^{n} c_i possible different ways to generate the given amino acid sequence in terms of nucleotide sequences; this is exponential growth, and enumerating all possibilities won't be tractable for even reasonably long sequences.

ADD COMMENTlink written 2.8 years ago by Rob4.0k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1554 users visited in the last hour