Translating a mitochondrial cDNA
11.4 years ago

Here is the cDNA for the UCSC knownGene uc011mfh.1, a protein said to be located at chrM:5,855-7,427. http://genome.ucsc.edu/cgi-bin/hgTracks?org=Human&db=hg19&position=uc011mfh.1

ATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCG
TAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTA
CTACACGACACGTACTACGTTGTAGCTCACTTCCACTATGTCCTATCAAT
AGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTAT
TCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCACTATC
ATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCT
ATCCGGAATGCCCCGACGTTACTCGGACTACCCCGATGCATACACCACAT
GAAACATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATA
TTAATAATTTTCATGATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCT
AATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCC
CACCCTACCACACATTCGAAGAACC


and here is the protein for this gene ( http://genome.ucsc.edu/cgi-bin/hgGene?hgg_do_getProteinSeq=1&hgg_gene=uc011mfh.1 )

>uc011mfh.1 (BC018860) length=175
MICCSALSPRIHLSFHRRWPDWHCISKLITRHRTTRHVLRCSSLPLCPINRSCICHHRRL
HSLISPILRLHPRPNLRQNPFHYHIHRRKSNFLPTTLSRPIRNAPTLLGLPRCIHHMKHP
IICRLIHFSNSSNINNFHDLRSLRFEAKSPNSRRTLHKPGVTIWMPPTLPHIRRT


Translating the cDNA with the standard genetic code at http://www.ebi.ac.uk/Tools/emboss/transeq/ returns the same peptide:

>EMBOSS_001_1
MICCSALSPRIHLSFHRRWPDWHCISKLITRHRTTRHVLRCSSLPLCPINRSCICHHRRL
HSLISPILRLHPRPNLRQNPFHYHIHRRKSNFLPTTLSRPIRNAPTLLGLPRCIHHMKHP
IICRLIHFSNSSNINNFHDLRSLRFEAKSPNSRRTLHKPGVTIWMPPTLPHIRRT


but if the mitochondrial genetic code is used, the protein contains many stops:

>EMBOSS_001_1
MICCSALSP*IHLSFHR*WPDWHCISKLIT*HRTTRHVLRCSSLPLCPIN*SCICHH**L
HSLISPIL*LHP*PNLRQNPFHYHIHRRKSNFLPTTLSRPIRNAPTLLGLPRCMHHMKHP
IIC*LIHFSNSSNINNFHDL*SLRFEAKSPNS**TLHKPGVTMWMPPTLPHIR*T


Is it me, or is it a bug?

Pierre

Indulging a bit of curiosity, I took the DNA fragment and searched Ensembl with it to see what they said. There this turns out to be the tail end of MT-CO1-201, which covers MT: 5,904-7,445. The mitochondrial gene models in Ensembl come from the source entries GenBank J01415 and RefSeq NC_012920. So it appears that, in this case at least, re-annotation was not a good idea.

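All the stops fall where the standard translation has an 'R' (arginine). In the vertebrate mitochondrial code the arginine codons AGA and AGG are read as stops, while the CGN codons still encode arginine, which is why some of the R's survive in the mitochondrial translation.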
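To check which codons are responsible, here is a minimal sketch that scans a sequence for in-frame AGA/AGG codons, the two codons that NCBI translation table 2 (vertebrate mitochondrial) reads as stop. The function name `arg_stop_codons` and the variable `prefix` are made up for illustration; `prefix` is simply the first 57 nt of the cDNA from the question, and frame 1 is assumed.

```python
# Codons that mean arginine in the standard code but stop in the
# vertebrate mitochondrial code (NCBI translation table 2).
MITO_STOP_ARG = {"AGA", "AGG"}

def arg_stop_codons(dna):
    """Return (codon_index, codon) for every in-frame AGA/AGG codon.

    codon_index is 0-based and counted from the first base of `dna`.
    """
    dna = dna.upper()
    hits = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in MITO_STOP_ARG:
            hits.append((i // 3, codon))
    return hits

# First 57 nt (19 codons) of the uc011mfh.1 cDNA quoted in the question.
prefix = "ATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCGTAGGTGG"

print(arg_stop_codons(prefix))  # [(9, 'AGG'), (17, 'AGG')]
```

Codon 16 in that stretch is CGT, which is arginine in both codes, so it is not reported. That matches the EMBOSS output above, where "HRRW" becomes "HR*W".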

11.4 years ago
Neilfws 49k

I get more or less the same result and so can only conclude that UCSC have used the standard, not mitochondrial, genetic code to translate in this case.

This is not unprecedented - see this recent discussion on the UCSC mailing list. Their response is that it's on their to-do list.
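For anyone who wants to reproduce the comparison without the EMBOSS web form, here is a minimal pure-Python sketch. The two tables are built from the amino acid strings of NCBI translation tables 1 (standard) and 2 (vertebrate mitochondrial), with bases in TCAG order. The names `translate`, `STANDARD`, `VERT_MITO` and `prefix` are my own, and only the first 57 nt of the cDNA are used to keep it short.

```python
from itertools import product

# Codons in NCBI order: first base varies slowest, bases ordered T, C, A, G.
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]

# NCBI translation table 1 (standard) and table 2 (vertebrate mitochondrial).
STANDARD = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"))
VERT_MITO = dict(zip(
    CODONS,
    "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"))

def translate(dna, table):
    """Translate frame 1 of `dna`, ignoring any trailing partial codon."""
    dna = dna.upper()
    end = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, end, 3))

# Codons on which the two codes disagree.
differing = [c for c in CODONS if STANDARD[c] != VERT_MITO[c]]
print(differing)                     # ['TGA', 'ATA', 'AGA', 'AGG']

# First 57 nt of the uc011mfh.1 cDNA quoted in the question.
prefix = "ATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTCATCTTTCTTTTCACCGTAGGTGG"
print(translate(prefix, STANDARD))   # MICCSALSPRIHLSFHRRW
print(translate(prefix, VERT_MITO))  # MICCSALSP*IHLSFHR*W
```

The standard-code result matches the start of the UCSC protein, and the mitochondrial-code result matches the start of the EMBOSS output with stops, which is consistent with the standard table having been used for the UCSC translation.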

Thank you, Neil, for pointing out this thread on the UCSC mailing list.

11.4 years ago
dfornika ★ 1.1k

Is it possible that the cDNA sequence was imputed from the amino acid sequence, using the standard translation table?

No, my cDNA was constructed from the genomic DNA and the structure of the gene.