Question

Amino Acid Position In UCSC Browser

13.4 years ago

jvijai ★ 1.2k

How do you get the amino acid position of a particular residue in a protein from the UCSC browser? The link http://bit.ly/hiGIMu shows the Methionine in green. How can I find the amino acid position of this residue?

ucsc amino-acids translation • 6.9k views

updated 13.4 years ago by Pierre Lindenbaum 161k • written 13.4 years ago by jvijai ★ 1.2k

To answer that question you'd have to say "position in a particular protein sequence". With alternative splicing or alternative start sites (for instance), there can be many different answers to this question. There are usually one or more canonical reference sequences; maybe you should narrow the question.

13.4 years ago by David Quigley 11k

In this case, it is uc001opa.1 (CCND1), length=295. Thanks for pointing it out.

13.4 years ago by jvijai ★ 1.2k

Ram · Answer 1 · 2010-12-15

You can use the public UCSC MySQL server to get the positions of the exons.

 mysql  -h  genome-mysql.cse.ucsc.edu -A -u genome -D hg18 -e \
       'select * from knownGene where name="uc001opa.1"\G'
*************************** 1. row ***************************
      name: uc001opa.1
     chrom: chr11
    strand: +
   txStart: 69165053
     txEnd: 69178423
  cdsStart: 69165262
    cdsEnd: 69175231
 exonCount: 5
exonStarts: 69165053,69166979,69167780,69171942,69175066,
  exonEnds: 69165460,69167195,69167940,69172091,69178423,
 proteinID: P24385
   alignID: uc001opa.1

Here your transcript contains 5 exons (the first exon starts at 69165053 and ends at 69165460 - 1, because UCSC tables store 0-based, half-open coordinates). The first translated base is at 69165262 (cdsStart).

Using the reference sequence for 'chr11', you then 'just' have to walk over each exon to translate your protein until you've found your amino acid.
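If you only need the residue number for a genomic coordinate (rather than the residue itself), you don't even need the chr11 sequence: counting coding bases across the exons is enough. Here is a minimal Python sketch of that walk, using the knownGene values from the output above. The function name `genomic_to_aa_position` is my own invention, and the code assumes 0-based, half-open coordinates and that cdsEnd includes the stop codon (the numbers agree with that: 888 coding bases = 296 codons = 295 residues plus stop).

```python
# Values copied from the knownGene row for uc001opa.1 (hg18).
STRAND = "+"
CDS_START, CDS_END = 69165262, 69175231
EXON_STARTS = [69165053, 69166979, 69167780, 69171942, 69175066]
EXON_ENDS = [69165460, 69167195, 69167940, 69172091, 69178423]


def genomic_to_aa_position(pos, strand, cds_start, cds_end, exon_starts, exon_ends):
    """Return the 1-based codon number covering the 0-based genomic position
    `pos`, or None if `pos` is not in the coding part of an exon."""
    if not (cds_start <= pos < cds_end):
        return None  # UTR or outside the transcript

    # Keep only the coding part of each exon.
    blocks = []
    for start, end in zip(exon_starts, exon_ends):
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:
            blocks.append((start, end))

    # On the minus strand the transcript is read from the last block backwards.
    if strand == "-":
        blocks.reverse()

    coding_bases_before = 0
    for start, end in blocks:
        if start <= pos < end:
            within = pos - start if strand == "+" else end - 1 - pos
            return (coding_bases_before + within) // 3 + 1
        coding_bases_before += end - start
    return None  # position falls in an intron


# The first translated base belongs to codon 1.
print(genomic_to_aa_position(CDS_START, STRAND, CDS_START, CDS_END,
                             EXON_STARTS, EXON_ENDS))
```

Remember to subtract 1 from a coordinate read off the browser display (which is 1-based) before passing it in.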

UPDATE: even easier: there is a table named knownGenePep containing the peptide sequence for a given knownGene entry.

 mysql  -h  genome-mysql.cse.ucsc.edu -A -u genome -D hg18 -e \
       'select * from knownGene as K, knownGenePep as P where P.name=K.name and K.name="uc001opa.1"\G'
*************************** 1. row ***************************
      name: uc001opa.1
     chrom: chr11
    strand: +
   txStart: 69165053
     txEnd: 69178423
  cdsStart: 69165262
    cdsEnd: 69175231
 exonCount: 5
exonStarts: 69165053,69166979,69167780,69171942,69175066,
  exonEnds: 69165460,69167195,69167940,69172091,69178423,
 proteinID: P24385
   alignID: uc001opa.1
      name: uc001opa.1
       seq: MEHQLLCCEVETIRRAYPDANLLNDRVLRAMLKAEETCAPSVSYFKCVQKEVLPSMRKIVATWMLEVCEEQKCEEEVFPLAMNYLDRFLSLEPVKKSRLQLLGATCMFVASKMKETIPLTAEKLCIYTDNSIRPEELLQMELLLVNKLKWNLAAMTPHDFIEHFLSKMPEAEENKQIIRKHAQTFVALCATDVKFISNPPSMVAAGSVVAAVQGLNLRSPNNFLSYYRLTRFLSRVIKCDPDCLRACQEQIEALLESSLRQAQQNMDPKAAEEEEEEEEEVDLACTPTDVRDVDI
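
Once you have the peptide, listing the 1-based positions of a given residue is a one-liner. A small Python sketch, with the sequence copied from the `seq` field above; the helper name `residue_positions` is hypothetical:

```python
# Peptide for uc001opa.1, copied from the knownGenePep output above.
SEQ = (
    "MEHQLLCCEVETIRRAYPDANLLNDRVLRAMLKAEETCAPSVSYFKCVQKEVLPSMRKIVATWMLEVCEEQKCEEEVFPLAMNYLDRFLSLEPVKKSRLQLLGATCMFVASKMKETIPLTAEKLCIYTDNSIRPEELLQMELLLVNKLKWNLAAMTPHDFIEHFLSKMPEAEENKQIIRKHAQTFVALCATDVKFISNPPSMVAAGSVVAAVQGLNLRSPNNFLSYYRLTRFLSRVIKCDPDCLRACQEQIEALLESSLRQAQQNMDPKAAEEEEEEEEEVDLACTPTDVRDVDI"
)


def residue_positions(seq, residue):
    """Return the 1-based positions at which `residue` occurs in `seq`."""
    return [i for i, aa in enumerate(seq, start=1) if aa == residue]


# Print every Methionine position in the protein.
print(residue_positions(SEQ, "M"))
```

You still need the exon walk to tell which of these Methionines is the one highlighted at your genomic position.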