Question: Get DNA and Amino Acid Sequence from the UCSC Genome Browser
3.9 years ago, abhikdeora wrote:

I'm trying to get both the nucleotide and amino acid sequence for a given region of the mm10 genome, preferably in plain text format.

I know that I can get the DNA sequence for the region by going to http://genome.ucsc.edu/cgi-bin/das/mm10/dna?segment=chr3:93396405,93396500, for example.

Is there a similar link that provides the translated amino acid sequence for the same region in plain text format?

I've tried to use the Table Browser, but it never gives me the exact sequence that I want.

Thanks much.

modified 3.9 years ago by Maximilian Haeussler • written 3.9 years ago by abhikdeora
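The DAS request in the question returns XML rather than plain text, so the sequence has to be pulled out of the response. The following is a minimal sketch, assuming the response wraps the bases in a `<DNA>` element (the usual DAS layout) and that the bases may be split across lines. The function names `parse_das_dna` and `fetch_region` are hypothetical, and the URL template simply mirrors the link quoted above.

```python
import urllib.request
import xml.etree.ElementTree as ET

# URL template copied from the question; {db}, {chrom}, {start}, {end} are filled in below.
DAS_URL = "http://genome.ucsc.edu/cgi-bin/das/{db}/dna?segment={chrom}:{start},{end}"


def parse_das_dna(xml_data):
    """Return the sequence inside the first <DNA> element, uppercased, whitespace removed.

    Assumes the DAS response nests <DNA> somewhere below the root element.
    """
    root = ET.fromstring(xml_data)
    dna = root.find(".//DNA")
    if dna is None or dna.text is None:
        raise ValueError("no <DNA> element found in the DAS response")
    return "".join(dna.text.split()).upper()


def fetch_region(db, chrom, start, end):
    """Download a region (1-based, inclusive coordinates) and return it as plain text."""
    url = DAS_URL.format(db=db, chrom=chrom, start=start, end=end)
    with urllib.request.urlopen(url) as response:
        return parse_das_dna(response.read())
```

For the region in the question, the call would be `fetch_region("mm10", "chr3", 93396405, 93396500)`; it needs network access and depends on the DAS service still being available.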
3.9 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

You could translate the DNA with the following XSLT stylesheet:

$ xsltproc translate.xsl "http://genome.ucsc.edu/cgi-bin/das/mm10/dna?segment=chr3:93396405,93396500"

>translate(chr3:93396405-93396500)
ARPESSPGSERQTRPESSPGSERQARPESSPG

 

written 3.9 years ago by Pierre Lindenbaum

That XSLT stylesheet doesn't seem to translate the sequence in the correct reading frame. Based on the UCSC genome browser, the correct amino acid sequence for mm10 chr3:93396405,93396500 is QDSPHRGQK...

I'm trying to get the amino acid sequence that is displayed in the UCSC genome browser itself.

written 3.9 years ago by abhikdeora

That's because the reading frame starts at position 93396406, so the DAS segment should start there:

$ xsltproc jeter.xsl "http://genome.ucsc.edu/cgi-bin/das/mm10/dna?segment=chr3:93396406,93396500"

>translate(chr3:93396406-93396500)
QDQSPHRGQKGRQDQSPHQGQKGRQDQSPHR

written 3.9 years ago by Pierre Lindenbaum
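The one-base shift above changes the whole translation because codons are read in non-overlapping triplets. As an illustration (this is not the XSLT stylesheet used in the answer), here is a small sketch that translates a forward-strand sequence in each of the three forward frames with the standard genetic code. The names `translate` and `three_frames` are hypothetical, and incomplete or ambiguous codons are reported as `X` by assumption.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame` (0, 1 or 2); trailing bases are dropped."""
    seq = dna.upper()[frame:]
    usable = len(seq) - len(seq) % 3
    # Codons containing anything other than A, C, G, T map to "X".
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def three_frames(dna):
    """Return a dict mapping each forward frame offset to its translation."""
    return {frame: translate(dna, frame) for frame in range(3)}
```

Comparing the three results against the track shown in the browser is one way to find which offset matches the annotated frame. Genes on the minus strand would additionally need the reverse complement, which this sketch does not handle.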

So if you want to know the amino acid translated for the gene at this position, the stylesheet won't work. You should work with the knownGene table to get the positions of the exons.

written 3.9 years ago by Pierre Lindenbaum

For example, see the question "Is A Genome Position In An Exon Or Intron?"

written 3.9 years ago by Pierre Lindenbaum
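To make the knownGene suggestion concrete: that table stores each transcript's exons as two comma-separated lists, `exonStarts` and `exonEnds`, with 0-based starts and exclusive ends. The sketch below checks whether a 1-based genomic position falls inside one of those exons. The function names are hypothetical and the coordinates in the usage note are made up; real values would come from the table itself.

```python
def parse_exons(exon_starts, exon_ends):
    """Turn knownGene-style strings such as "100,300," into a list of (start, end) pairs."""
    starts = [int(value) for value in exon_starts.split(",") if value]
    ends = [int(value) for value in exon_ends.split(",") if value]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds have different lengths")
    return list(zip(starts, ends))


def find_exon(position, exons):
    """Return the index of the exon containing a 1-based position, or None.

    Exons are 0-based, half-open intervals, so the position is shifted by one first.
    """
    zero_based = position - 1
    for index, (start, end) in enumerate(exons):
        if start <= zero_based < end:
            return index
    return None
```

With the made-up exons `parse_exons("100,300,", "200,400,")`, position 101 falls in exon 0 and position 250 falls in the intron between the two exons, so `find_exon` returns `None` for it. Checking `cdsStart` and `cdsEnd` in the same way would tell whether the position is also inside the coding region.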

So are you saying that there is no direct way to download the amino acid sequence for a given region? That's hard to believe, considering that it is displayed in the genome browser.

That's all I'm looking for. A direct way to download the amino acid sequence for a given region. I'd rather avoid MySQL table lookups.

written 3.9 years ago by abhikdeora
3.9 years ago, Maximilian Haeussler (UCSC) wrote:

Can you explain what you're trying to do? It doesn't make a lot of sense to try to get an amino acid sequence for a random piece of DNA. If you need the amino acid sequence, you usually first click on a transcript. The resulting page has a link for the protein sequence.

written 3.9 years ago by Maximilian Haeussler

Oh, I'm starting to understand: you're looking at an exon. You can see the amino acid sequence shown on the screen. But if you click on the exon, all you can get is the full amino acid sequence of the whole transcript, not the little piece that you have on the screen.

Can you explain a little more what the end goal is? I'm struggling to find a use case where this particular function would be useful...

modified 3.9 years ago • written 3.9 years ago by Maximilian Haeussler

I'm a rising college freshman with very little bioinformatics experience (although I do have significant programming experience), so please bear with me.

I have a spreadsheet of single nucleotide mutations at certain positions in the mm10 genome (e.g. chr11:3133305 C->A). I'm trying to determine whether those mutations: 1) occur in a coding region of the genome, 2) yield a change in amino acid, and 3) determine what that change is.

The process by which I'm thinking of accomplishing that is this: download the nucleotide and amino acid sequence in a small range around the mutation, determine the reading frame of the DNA sequence, and from there determine the codon that the mutation occurs in. From there, it is trivial to find the change in amino acid resulting from the mutation.

I already have a Java program written that accomplishes the above given the nucleotide and amino acid sequence - I just need a way to download these sequences for a given region.

As I said earlier, I do have very little bioinformatics experience, so I would welcome suggestions for better ways to accomplish my goal.

written 3.9 years ago by abhikdeora
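The codon step described in this post can be sketched as follows. This is only an illustration under stated assumptions: `cds` is the spliced coding sequence of a plus-strand transcript starting at the first base of the start codon, and `offset` is the 0-based index of the mutated base within it. The function name `amino_acid_change` is hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_change(cds, offset, alt_base):
    """Return (codon_number, reference_aa, mutated_aa) for a single-base substitution.

    codon_number is 1-based; a stop codon is reported as "*".
    """
    cds = cds.upper()
    position_in_codon = offset % 3
    codon_start = offset - position_in_codon
    ref_codon = cds[codon_start:codon_start + 3]
    if len(ref_codon) < 3:
        raise ValueError("offset falls in an incomplete codon")
    alt_codon = (
        ref_codon[:position_in_codon]
        + alt_base.upper()
        + ref_codon[position_in_codon + 1:]
    )
    return codon_start // 3 + 1, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
```

If the reference and mutated amino acids are equal, the substitution is synonymous. The hard part in practice is obtaining the correct `cds` and `offset` from transcript annotation, which is what dedicated annotation tools handle.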

Use the UCSC Variant Annotation Integrator (VAI); it does exactly that.

The pgSnp format seems to be the easiest in your case. Convert your table to this format:

http://genome.ucsc.edu/FAQ/FAQformat.html#format10

1) Upload your table as a custom track here: http://genome.ucsc.edu/cgi-bin/hgCustom

2) Go to the VAI: http://genome.ucsc.edu/cgi-bin/hgVai

3) Select your custom track and click "get results".

written 3.9 years ago by Maximilian Haeussler
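The conversion to pgSnp could look roughly like the sketch below. It assumes the seven pgSnp columns described in the FAQ linked above (chrom, 0-based chromStart, chromEnd, alleles separated by `/`, allele count, allele frequencies, allele scores) and writes 0 for the unknown frequencies and scores. Listing both the reference and the mutated base as alleles is an assumption about how the spreadsheet should be encoded, and the function names and the track name are hypothetical; check the FAQ before uploading.

```python
def to_pgsnp_line(chrom, position, ref, alt):
    """Format one single-nucleotide mutation (1-based position) as a pgSnp line."""
    start = position - 1  # pgSnp uses 0-based, half-open coordinates
    fields = [chrom, str(start), str(position), f"{ref}/{alt}", "2", "0,0", "0,0"]
    return "\t".join(fields)


def to_pgsnp_track(mutations, db="mm10", name="my_mutations"):
    """Build custom-track text from (chrom, position, ref, alt) tuples."""
    header = f'track type=pgSnp db={db} name="{name}"'
    return "\n".join([header] + [to_pgsnp_line(*mutation) for mutation in mutations])
```

For the example mutation from the thread, `to_pgsnp_track([("chr11", 3133305, "C", "A")])` produces a track line followed by one tab-separated data line covering chr11 from 3133304 to 3133305.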

That seems perfect, thanks a lot! I'll definitely use it.

For the sake of having the question answered, is there a way to download the amino acid sequence for a region in plain text? I'd hate for someone who needs that and arrives at this thread to not find an answer.

written 3.9 years ago by abhikdeora

There is nothing that I know of. You can click on a transcript and get the full amino acid sequence, but not just the slice currently in view.

written 3.9 years ago by Maximilian Haeussler

"I have a spreadsheet of single nucleotide mutations at certain positions in the mm10 genome (e.g. chr11:3133305 C->A). I'm trying to determine whether those mutations: 1) occur in a coding region of the genome, 2) yield a change in amino acid, and 3) determine what that change is"

 

So use something like Ensembl VEP: http://www.ensembl.org/info/docs/tools/vep/index.html

written 3.9 years ago by Pierre Lindenbaum