Fetch gene coordinates from Ensembl database for GRCh37 via Perl API
9.6 years ago

I have a list of Ensembl gene IDs. I am trying to fetch the coordinates for these genes through the Ensembl API, but I could not figure out how.

ENSG00000000938 ENSG00000000971 ENSG00000001036 ENSG00000001084 ENSG00000001167 ENSG00000001460 ENSG00000001461 ENSG00000001497 ENSG00000001561 ENSG00000001617 ENSG00000001626 ENSG00000001629 ENSG00000001630 ENSG00000001631

Can anybody help me? I could get them from the UCSC Table Browser, but I need them from Ensembl.

What have you tried (i.e., show us the code you're using now that's not working)? BTW, is there a reason you want to use the Perl API? You could also just use R and load the appropriate annotation package.

I do not know much R, so I thought of using the Perl API. However, I could not find any examples in the tutorial for getting coordinates, and I could not find an API function for fetching coordinates by gene name.

Also, why do you think the coordinates would be different if you're using the same genome build? The only difference between UCSC and Ensembl coordinates is the chromosome name, which is trivial to alter.

I can try this, but since I use iGenomes builds, it would be good to get some hands-on experience with the Ensembl databases through their API.

9.6 years ago
Emily 23k

You need to use the GeneAdaptor to fetch_by_stable_id. Then you can use the seq_region_name, seq_region_start and seq_region_end methods on the Gene module to get the coordinates.

For basic API use, take a look at our free online course.
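
As a rough sketch of the approach above applied to a whole list of IDs (not an official Ensembl example): the script below takes stable IDs on the command line and prints one tab-separated line per gene. The helper names `gene_to_line` and `print_coordinates` and the script name `gene_coords.pl` are made up for illustration. It assumes `fetch_by_stable_id` returns undef for an ID it cannot find.

```perl
use strict;
use warnings;

# Format one gene as a tab-separated line: stable ID, chromosome, start, end.
# Works with any object that offers the four accessor methods used here.
sub gene_to_line {
    my ($gene) = @_;
    return join "\t", $gene->stable_id, $gene->seq_region_name,
        $gene->seq_region_start, $gene->seq_region_end;
}

# Look up each stable ID and print its coordinates.
sub print_coordinates {
    my (@stable_ids) = @_;

    # Loaded at run time, so the API is only needed when a lookup is requested
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
        -port => 5306,
    );
    my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );

    for my $id (@stable_ids) {
        my $gene = $gene_adaptor->fetch_by_stable_id($id);
        if ( !defined $gene ) {
            # Assumption: an unknown ID yields undef rather than an exception
            warn "No gene found for $id\n";
            next;
        }
        print gene_to_line($gene), "\n";
    }
}

# Usage: perl gene_coords.pl ENSG00000000938 ENSG00000000971 ...
print_coordinates(@ARGV) if @ARGV;
```

The coordinates reported are those of whichever assembly the connected database holds, so the output depends on the API release and server in use.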

Thanks Emily. It worked. :)

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

# Connect to the public Ensembl MySQL server
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous',
    -port => 5306 );

my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
my $gene = $gene_adaptor->fetch_by_stable_id( 'ENSG00000099889' );

# Chromosome name, start and end of the gene
print $gene->seq_region_name(), "\n";
print $gene->seq_region_start(), "\n";
print $gene->seq_region_end(), "\n";

But I want GRCh37 coordinates instead of GRCh38. Where should I specify the assembly version in my code?

The API will always access the database from which it was downloaded. If you use the release 76 API, you'll access the release 76 database which is GRCh38. To get the old API (old database, old assembly), just install the API from our archive site.
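
One way to check which release and assembly a given installation actually reaches is sketched below. It is only an illustration: the helper names `connection_args` and `report_assembly` and the script name `check_assembly.pl` are hypothetical. The GRCh37 entry relies on an assumption that is not stated in this thread, namely that Ensembl serves GRCh37 databases on port 3337 of the public server for newer API releases. With an archived API such as release 75, the default connection should already report GRCh37.

```perl
use strict;
use warnings;

# Connection settings per assembly.
# Assumption (not from this thread): GRCh37 databases for newer releases
# are served on port 3337 of the public Ensembl MySQL server.
my %SERVER_FOR = (
    GRCh38 => { -host => 'ensembldb.ensembl.org', -user => 'anonymous', -port => 3306 },
    GRCh37 => { -host => 'ensembldb.ensembl.org', -user => 'anonymous', -port => 3337 },
);

# Return the load_registry_from_db arguments for an assembly name.
sub connection_args {
    my ($assembly) = @_;
    my $args = $SERVER_FOR{$assembly}
        or die "Unknown assembly '$assembly'\n";
    return %{$args};
}

# Print the installed API release and the assembly a gene is placed on.
sub report_assembly {
    my ( $assembly, $stable_id ) = @_;

    # Loaded at run time, so the API is only needed when a lookup is requested
    require Bio::EnsEMBL::Registry;
    require Bio::EnsEMBL::ApiVersion;
    print "API release: ", Bio::EnsEMBL::ApiVersion::software_version(), "\n";

    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db( connection_args($assembly) );

    my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
    my $gene = $gene_adaptor->fetch_by_stable_id($stable_id)
        or die "No gene found for $stable_id\n";

    # The coordinate system version of the gene's slice names the assembly
    print "Assembly: ", $gene->slice->coord_system->version, "\n";
    print join( "\t", $gene->seq_region_name, $gene->seq_region_start,
        $gene->seq_region_end ), "\n";
}

# Usage: perl check_assembly.pl GRCh37 ENSG00000099889
report_assembly(@ARGV) if @ARGV == 2;
```

If the reported assembly is not the one you expect, the API release or the server port is the first thing to check.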

Hi Emily, both sites, the latest one and GRCh37, show the same installation instructions.

http://grch37.ensembl.org/info/docs/api/api_installation.html

http://www.ensembl.org/info/docs/api/api_installation.html

How would I get the GRCh37 API? Could you point me to the archived API?

Should I follow the GitHub installation instructions? http://feb2014.archive.ensembl.org/info/docs/api/api_git.html

The easiest way is probably to download the tarball from the archive site.
