Get Ensembl Gene By Species/Strain For E. Coli
11.7 years ago
Jirapong ▴ 20

Based on the data at http://bacteria.ensembl.org/index.html, all E. coli strains are stored in a single database:

escherichia_shigella_collection_core_9_62_5a

I'm using Ensembl API version 62 and can connect without problems. However, when I try to get genes for "E coli Sakai" and "E coli K12", both return the same number of genes (4511).

But if I search on the website - http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=EBESCG00000001004;genomic_unit=bacteria

I can see that this gene is specific to E. coli K12 only.

My Perl script looks like this:

$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species => 'E coli K12', -dbname => 'escherichia_shigella_collection_core_9_62_5a');
my $slice_adaptor = $db->get_SliceAdaptor();
# The E. coli chromosome is named 'Chromosome'
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome');
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));

$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species => 'E coli Sakai', -dbname => 'escherichia_shigella_collection_core_9_62_5a');
my $slice_adaptor = $db->get_SliceAdaptor();
# The E. coli chromosome is named 'Chromosome'
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome');
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));

11.7 years ago
Andeyatz ▴ 70

What has happened here is that you have not created multi-species database adaptors for the different strains, so both adaptors return the same genes. There are two ways around this. The first is to use the registry's load_registry_from_db method, which handles this for you; the second is to create each DBAdaptor explicitly.

# Using the registry
Bio::EnsEMBL::Registry->load_registry_from_db(-HOST => 'mysql.ebi.ac.uk', -PORT => 4157, -USER => 'anonymous');
my $ecoli_k12 = Bio::EnsEMBL::Registry->get_DBAdaptor('e_coli_k12', 'core');
my $sakai = Bio::EnsEMBL::Registry->get_DBAdaptor('e_coli_sakai', 'core');
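
To check that the two adaptors really are strain-specific, one option is to count the genes on the chromosome for each of them. The following is only a sketch: `count_chromosome_genes` is a hypothetical helper name, and it assumes, as in the question, that the chromosome sits in the 'chromosome' coordinate system under the name 'Chromosome'.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): counts the genes on the
# chromosome for whatever database adaptor it is given.
# Assumption: the region is ('chromosome', 'Chromosome'), as in the question.
sub count_chromosome_genes {
    my ($db_adaptor) = @_;

    my $slice_adaptor = $db_adaptor->get_SliceAdaptor();
    my $chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome')
        or die "Could not fetch the 'Chromosome' region\n";

    # get_all_Genes returns an array reference; return its length
    return scalar @{ $chromo->get_all_Genes() };
}
```

With the registry loaded, calling `count_chromosome_genes($ecoli_k12)` and `count_chromosome_genes($sakai)` and printing both numbers should show whether the two strains still report an identical gene count.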


Now for the more manual version

-- SQL to find the species -> species id
select species_id, meta_value from meta where meta_key = 'species.db_name';


This query lists every species name in the database together with its species ID. At the time of writing, e_coli_k12 is species ID 1 and e_coli_sakai is species ID 12. Using this information, we can go back to your original example:

$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species => 'e_coli_k12', -dbname => 'escherichia_shigella_collection_core_9_62_5a', -species_id => 1, -multispecies_db => 1);
my $slice_adaptor = $db->get_SliceAdaptor();
# The E. coli chromosome is named 'Chromosome'
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome');
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));

$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species => 'e_coli_sakai', -dbname => 'escherichia_shigella_collection_core_9_62_5a', -species_id => 12, -multispecies_db => 1);
my $slice_adaptor = $db->get_SliceAdaptor();
# The E. coli chromosome is named 'Chromosome'
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome');
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));
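
Since the species IDs can change between releases, it may be safer to look them up at run time instead of hard-coding 1 and 12. A minimal sketch, assuming a connected DBI handle to the collection database; `species_ids_by_name` is a hypothetical helper name, and the empty password in the usage comment is an assumption:

```perl
use strict;
use warnings;

# Hypothetical helper: runs the meta-table query shown above and returns a
# hash reference mapping species.db_name => species_id.
# $dbh is assumed to be a connected DBI database handle.
sub species_ids_by_name {
    my ($dbh) = @_;

    my $rows = $dbh->selectall_arrayref(
        q{SELECT species_id, meta_value FROM meta WHERE meta_key = 'species.db_name'}
    );

    # Each row is [species_id, meta_value]; key the hash by the species name
    my %id_for = map { $_->[1] => $_->[0] } @{$rows};
    return \%id_for;
}

# Usage sketch (needs DBI, DBD::mysql and a reachable server; the empty
# password is an assumption):
#   use DBI;
#   my $dbh = DBI->connect(
#       'DBI:mysql:database=escherichia_shigella_collection_core_9_62_5a;host=mysql.ebi.ac.uk;port=4157',
#       'anonymous', '', { RaiseError => 1 });
#   my $ids = species_ids_by_name($dbh);
#   # then pass -species_id => $ids->{e_coli_sakai} to the DBAdaptor constructor
```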


HTH


This helped me a lot. Thank you so much.