Is It Possible To Find For A Given Gene The Coding Sequences On Ensembl?
13.0 years ago
Blunders ★ 1.1k

I have found this information on NCBI, but I am now trying to find the same thing on Ensembl.

On NCBI, it was found by:

  • Going to this URL; clearly BRCA2 would have to be replaced if the search was for another gene.
  • Clicking a related result
  • On that page scroll down to "RefSeqs of Annotated Genomes"
  • Within that section, under the subsection "Genomic"
  • Then within Download click "GenBank" for a given reference sequence genome build.

Here's an example of what the page should look like: http://www.ncbi.nlm.nih.gov/nuccore/NC_000013.10?from=32889617&to=32973809&report=genbank

Then on the page scroll down to CDS to see the coding sequences.
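For anyone who wants to read the CDS entry programmatically rather than by eye, here is a minimal sketch of turning a GenBank-style CDS location string into exon segments. The function names and the coordinates in the example are hypothetical (they are not the real BRCA2 coordinates), and the sketch ignores `complement(...)` and single-base segments.

```python
import re

def parse_cds_location(location):
    """Return (start, end) pairs, 1-based inclusive, from a GenBank location string.

    Assumption: the location is a plain range or a join(...) of ranges.
    Partial markers (< and >) are tolerated; complement() is not interpreted.
    """
    # GenBank wraps long locations over several lines, so drop all whitespace first.
    compact = "".join(location.split())
    pairs = re.findall(r"<?(\d+)\.\.>?(\d+)", compact)
    return [(int(start), int(end)) for start, end in pairs]

def cds_length(segments):
    """Total number of bases covered by the segments."""
    return sum(end - start + 1 for start, end in segments)

# Toy location string with made-up coordinates, wrapped the way GenBank wraps it.
example = "join(100..199,\n    300..449,500..520)"
segments = parse_cds_location(example)
print(segments)
print(cds_length(segments))
```

A CDS length that is not a multiple of three is usually a sign that a segment was missed or that the feature is partial.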

cds ensembl • 3.2k views

Are you asking how to go from a gene name to an Ensembl gene ID? Would BioMart be any help, e.g. http://www.biomart.org/biomart/martview, or the Perl API?

13.0 years ago

In a very similar way, you can do this using Ensembl's BioMart:

  1. Select "Ensembl Genes 61" as the database.
  2. Select "Homo sapiens genes" as the dataset.
  3. In the "Filters" section, go to the "Gene" subsection and enter the list of gene IDs of interest ("HGNC symbols" is the selection for what is commonly understood as the known gene name).
  4. In the "Attributes" section, select the "Sequences" radio button, go to the "Sequences" subsection and choose "Exon sequences".

This will give you the information you need, but of course you can play around with all the available options if you want to get more out of this great database interface.
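The same selection can also be expressed as a BioMart XML query, which is handy for scripting. The following is only a sketch under assumptions: the martservice URL, the dataset name `hsapiens_gene_ensembl`, the filter name `hgnc_symbol` and the attribute names `gene_exon` and `coding` are what I believe the web interface generates, so check them against the XML button in BioMart before relying on them. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumed endpoint; verify against the BioMart instance you actually use.
MARTSERVICE_URL = "http://www.ensembl.org/biomart/martservice"

def build_biomart_query(symbols, sequence_attribute="gene_exon"):
    """Build a BioMart XML query asking for sequences of the given HGNC symbols.

    sequence_attribute: "gene_exon" for exon sequences, "coding" for the CDS
    (both attribute names are assumptions).
    """
    # quoteattr escapes the value and adds the surrounding double quotes.
    filter_value = quoteattr(",".join(symbols))
    attributes = ["ensembl_gene_id", "ensembl_transcript_id", sequence_attribute]
    attribute_xml = "".join(f'<Attribute name="{name}"/>' for name in attributes)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="FASTA" header="0" '
        'uniqueRows="0" datasetConfigVersion="0.6">'
        '<Dataset name="hsapiens_gene_ensembl" interface="default">'
        f'<Filter name="hgnc_symbol" value={filter_value}/>'
        f"{attribute_xml}"
        "</Dataset></Query>"
    )

def build_request_url(symbols, sequence_attribute="gene_exon"):
    """Return the GET URL that would submit the query to the martservice."""
    query = build_biomart_query(symbols, sequence_attribute)
    return MARTSERVICE_URL + "?" + urlencode({"query": query})

# Nothing is sent over the network here; the URL is only printed.
print(build_request_url(["BRCA2"], sequence_attribute="coding"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return FASTA text if the names above match the current BioMart configuration.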


The gene you are referring to, human APC, has 6 non-coding transcripts annotated: http://www.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000134982;r=5:112157646-112170862;t=ENST00000504915. So, that's the reason that BioMart says that for those transcripts no sequence is available when you ask for the CDS.


@Jorge Amigo: Thanks. Any idea what "Sequence unavailable" might mean? Here's an example for ENSG00000139618: http://www.biomart.org/biomart/martview/dee22528d7a6005fe66e60b279c6dbe6


Unfortunately not. If you dig into the BRCA2 gene you will find 6 transcripts, the 6th being ENST00000533776, a retained intron which is described as "no protein product". Bearing in mind that my biological background is limited, my guess would be that if you strictly select "coding sequences", they won't be able to provide you with a coding sequence for a retained intron. Hope this helps.
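If the "Sequence unavailable" entries get in the way downstream, one option is to separate them out after download. This is a minimal sketch that assumes the export is FASTA text in which such records carry the literal line "Sequence unavailable" instead of bases; the function names, the second header and the toy sequence are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    for block in text.split(">")[1:]:
        lines = block.strip().splitlines()
        if not lines:
            continue  # a bare ">" with nothing after it
        header = lines[0].strip()
        sequence = "".join(line.strip() for line in lines[1:])
        records.append((header, sequence))
    return records

def split_by_availability(records):
    """Separate records with real sequence from those marked unavailable."""
    marker = "sequence unavailable"
    available = [(h, s) for h, s in records if s.lower() != marker]
    missing = [h for h, s in records if s.lower() == marker]
    return available, missing

# Toy export: the second header and its sequence are invented placeholders.
toy_export = """>ENSG00000139618|ENST00000533776
Sequence unavailable
>ENSG00000139618|HYPOTHETICAL_CODING_TRANSCRIPT
ATGGCC
AAATGA
"""
available, missing = split_by_availability(read_fasta(toy_export))
print(available)
print(missing)
```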

13.0 years ago
Bert Overduin ★ 3.7k

Addition:

You can also export the coding sequences in FASTA format directly from the Ensembl Gene page.

For example for the human BRCA2 gene:

  • On the BRCA2 gene page click [Export data] in the side menu.
  • Select 'Output: FASTA sequence'.
  • Select 'Genomic: None'.
  • Select 'Coding sequence'.
  • Click [Next>].
  • Click on 'Text'.

When you are interested in multiple genes, of course BioMart (or the Perl API) would be your tool of choice.
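For a scripted version of the single-gene export, the Ensembl REST service is another possibility. The sketch below assumes the `sequence/id` endpoint at rest.ensembl.org accepts `type=cds` and, for a gene ID, `multiple_sequences=1`; please confirm this against the REST documentation. The helper names are hypothetical, and the gene ID is the BRCA2 ID mentioned in the comments above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server

def cds_request(stable_id, multiple=True):
    """Build (but do not send) a request for the CDS of a gene or transcript.

    multiple=True asks for one sequence per transcript, which a gene ID
    needs because a gene can have several coding transcripts.
    """
    params = {"type": "cds"}
    if multiple:
        params["multiple_sequences"] = "1"
    url = f"{REST_SERVER}/sequence/id/{stable_id}?{urlencode(params)}"
    # Ask for FASTA output instead of the default format.
    return Request(url, headers={"Content-Type": "text/x-fasta"})

def fetch_cds_fasta(stable_id):
    """Send the request and return the FASTA text (needs network access)."""
    with urlopen(cds_request(stable_id), timeout=30) as response:
        return response.read().decode("utf-8")

# Only the URL is printed here; call fetch_cds_fasta() to actually download.
request = cds_request("ENSG00000139618")
print(request.full_url)
```

Looping `fetch_cds_fasta` over a handful of IDs works for small lists; for many genes BioMart or the Perl API remains the better fit, as noted above.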
