Question: Codon Frequency Table For Human Mitochondrial Genes
8.6 years ago by
Larry_Parnell (Boston, MA USA) wrote:

I have looked at the Kazusa pages for codon frequency tables, but found nothing for human mitochondrial (MT) genes. Does anyone know of a source that gives the frequency with which each codon is used in the genes encoded by the human mitochondrial genome? Searches of the internet and the literature turn up genetic code tables (this codon = this amino acid, and so on), but no frequencies.

I am trying to assess the consequences of synonymous SNPs in protein-coding MT genes, for example to see whether a SNP allele changes a rarely used codon into a frequently used one. This could have consequences for translation rates.
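One way to frame that comparison, as a minimal sketch only: given a codon frequency table (per 1000 codons), check that the two alleles are really synonymous under the vertebrate mitochondrial genetic code (NCBI translation table 2) and then look up both frequencies. The function name `compare_synonymous` and the numbers in `toy_freqs` are hypothetical placeholders, not real human MT usage data.

```python
from itertools import product

# Vertebrate mitochondrial genetic code (NCBI translation table 2).
# Codons are ordered with bases T, C, A, G, third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
MT_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def compare_synonymous(ref_codon, alt_codon, freq_per_thousand):
    """Return (is_synonymous, ref_freq, alt_freq) for a single-codon change."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    synonymous = MT_CODE[ref_codon] == MT_CODE[alt_codon]
    return (synonymous,
            freq_per_thousand.get(ref_codon, 0.0),
            freq_per_thousand.get(alt_codon, 0.0))

# Placeholder frequencies: made-up numbers, NOT real human MT usage data.
toy_freqs = {"CTA": 70.0, "CTG": 10.0, "ATA": 40.0, "ATT": 30.0}

print(compare_synonymous("CTG", "CTA", toy_freqs))  # both Leu
print(compare_synonymous("ATA", "ATT", toy_freqs))  # Met vs Ile in this code
```

The synonymy check matters here because the mitochondrial code differs from the standard one: ATA encodes Met, TGA encodes Trp, and AGA/AGG are stops, so a change that is silent under the standard code is not necessarily silent in an MT gene.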


8.6 years ago by
Hamish wrote:

Checking the Codon Usage Database, and looking under "Mitochondrion" for Homo sapiens, I find "mitochondrion Homo sapiens [gbpri]: 31745" which appears to be what you are after.

Alternatively you could derive the codon frequencies yourself from a mitochondrial genome, e.g. RefSeq:NC_012920 by:

  1. Extracting the coding sequences (CDS) from the RefSeq entry, using a tool such as EMBOSS extractfeat.

  2. Calculating the codon frequencies from those coding sequences, using a tool such as EMBOSS cusp.
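As a rough illustration of what step 2 computes, here is a minimal Python sketch that counts codons across a set of CDS sequences and reports them per 1000 codons, the convention used in the Kazusa tables. It is not the EMBOSS implementation; the function names `codon_counts` and `per_thousand` and the toy sequences are made up for this example. Several human mitochondrial CDSs end in an incomplete stop codon (T or TA, completed by polyadenylation), so the sketch drops any trailing partial codon.

```python
from collections import Counter

def codon_counts(cds_sequences):
    """Count in-frame codons over an iterable of CDS strings."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip codons with ambiguity codes
                counts[codon] += 1
    return counts

def per_thousand(counts):
    """Convert raw codon counts to frequency per 1000 codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}

# Toy input only; real input would be the CDS features extracted in step 1.
toy_cds = ["ATGCTACTAATAT", "ATACTACCCTGA"]
counts = codon_counts(toy_cds)
freqs = per_thousand(counts)

for codon in sorted(freqs):
    print(codon, counts[codon], round(freqs[codon], 1))
```

Pooling CDSs from many genomes just means passing a longer list of sequences to `codon_counts`.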

To improve the results obtained, you could use additional data from the other sequenced mitochondria. For this you could obtain the CDS sequences from EMBLCDS, which is a database of coding sequences produced by the European Nucleotide Archive (ENA). Searching the data for NCBI_TaxId:9606, Organelle:mitochondrion and excluding partial sequences, gives 135,153 entries (SRS@EMBL-EBI), which will give much more robust frequencies than the 13 CDS features described in RefSeq:NC_012920.

FWIW a quick search in EMBL-Bank for complete human mitochondrial genomes finds 9,568 entries (SRS@EMBL-EBI).

Thank you, Hamish. I did not see this because I did not think to look under mitochondrion. I am skeptical that this is what I need, though, because it lists data compiled from 31745 CDSs (8998998 codons), and that is far too many for the 16 kbp MT genome.

@Larry That would assume that the table was derived from a single mitochondrial genome. In fact the databases contain many complete mitochondrial genomes from Homo sapiens. You can use the "List of codon usage for each CDS" linked from the page to identify the specific CDS sequences that were used to generate the table.
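A quick sanity check on those totals, using only the numbers quoted in this thread. The assumption that every contributing genome supplies all 13 CDSs is a simplification, since the table may also include CDSs from partial entries.

```python
n_cds = 31745        # CDS count reported for the table
n_codons = 8998998   # codon count reported for the table
cds_per_genome = 13  # protein-coding genes in one human MT genome

mean_codons_per_cds = n_codons / n_cds
approx_genomes = n_cds / cds_per_genome  # assumes 13 CDSs per genome

print(f"mean codons per CDS: {mean_codons_per_cds:.0f}")
print(f"equivalent complete genomes: {approx_genomes:.0f}")
```

This works out to about 283 codons per CDS and roughly 2,400 genomes' worth of CDSs, which is what you would expect from a table pooled over many sequenced individuals and not from a single 16 kbp genome.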

