Question

Tool: InterMine: A Flexible Data Warehouse System For Biological Data



11.9 years ago

Jelena Aleksic ▴ 920

InterMine is a data warehouse system created specifically for the integration and analysis of biological data. Whether you're a biologist, bioinformatician or developer, here are some reasons you might want to find out more about it.

Biologists: there are a number of large integrated databases that run off the InterMine system, including:

FlyMine for the Drosophila community
RatMine at RGD for rat data analysis
YeastMine at SGD for all things yeast (check out the iPhone app too!)
modMine - a data repository for the modENCODE project, containing a large number of fly and worm high throughput datasets
metabolicMine - a metabolic disease database, containing data from human, mouse and rat
TargetMine at NIBIO, Japan - a data warehouse for drug target discovery
MitoMiner - proteomic data for mitochondria

These Mines provide data mining tools such as keyword search, report pages, list analysis, enrichment statistics widgets, interactive graphs, a flexible query builder and region search. So if, for example, you want to find out which transcription factors mapped by modENCODE bind in the vicinity of your favourite gene, narrow down your list of candidate genes from a gene expression experiment, or browse metabolic disease data from three different organisms, you should check out the different Mines and their data analysis tools. The web interface is easy to use, and if you log in, you can also save your lists and queries in a private workspace.

Bioinformaticians: as well as accessing all the data above through the web applications, you can automate your workflow by accessing all InterMine functionality through an API. We currently have custom-written client libraries in Python, Perl, Ruby, Java and JavaScript, and are happy to add more based on user demand. To make scripts easier to write, the InterMine web app can also automatically generate the code for a query.
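To give a rough idea of what scripted access can look like, here is a minimal sketch that uses only the Python standard library to build a query URL for a Mine's web services. It does not use the client libraries and does not contact any server. The service root URL, the `/query/results` path, the `query`, `format` and `size` parameter names and the `genomic` model name are all assumptions to check against the documentation of the Mine you are using, and the helper names `build_query_xml` and `build_results_url` are made up for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_query_xml(root_class, views, constraints, model="genomic"):
    """Build a query as XML: which columns to return and how to filter.

    views       -- attribute paths relative to root_class, e.g. "symbol"
    constraints -- (path, operator, value) tuples, paths relative to root_class
    The element and attribute names here are assumptions, not a spec.
    """
    query = ET.Element(
        "query",
        {
            "model": model,
            "view": " ".join(f"{root_class}.{v}" for v in views),
        },
    )
    for path, op, value in constraints:
        ET.SubElement(
            query,
            "constraint",
            {"path": f"{root_class}.{path}", "op": op, "value": value},
        )
    return ET.tostring(query, encoding="unicode")


def build_results_url(service_root, query_xml, fmt="json", size=10):
    """Return a URL that asks the (assumed) results endpoint to run the query."""
    params = urlencode({"query": query_xml, "format": fmt, "size": size})
    return f"{service_root.rstrip('/')}/query/results?{params}"


if __name__ == "__main__":
    xml = build_query_xml(
        "Gene",
        ["symbol", "primaryIdentifier"],
        [("organism.name", "=", "Drosophila melanogaster")],
    )
    # Assumed service root for FlyMine; replace with your Mine's address.
    print(build_results_url("https://www.flymine.org/flymine/service", xml))
```

The printed URL could then be fetched with any HTTP client. In practice the client libraries, or the code generated by the web app, are likely to be the more convenient route.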

Developers: InterMine is model-agnostic, so it can host any data, but a model has been specifically constructed for biological data using the Sequence Ontology, and it can be easily extended by editing an XML file. InterMine is suitable for constructing anything from small to very large databases, has optimised queries and uses a cache to improve performance. It comes with an 'out of the box' web app containing existing analysis tools, and it includes RESTful web services. InterMine is fully extensible, so it can be customised and new analysis tools can be added to it. All the code is open source and freely available - check out http://www.intermine.org for details!
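As an illustration of what extending the model by editing an XML file might involve, the sketch below generates a small model-additions document from Python. The layout (a `classes` root with `class` and `attribute` elements, and attributes such as `is-interface` and `type`) is an assumption about the file format and should be checked against the InterMine documentation. The `knockoutPhenotype` attribute and the `build_additions` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET


def build_additions(classes):
    """Return an XML string describing extra classes and attributes.

    classes -- list of dicts with keys:
        "name"       -- class name, e.g. "Gene"
        "extends"    -- optional parent class name
        "attributes" -- list of (attribute name, Java type) tuples
    The element and attribute names are assumed, not taken from a spec.
    """
    root = ET.Element("classes")
    for cls in classes:
        class_attrs = {"name": cls["name"], "is-interface": "true"}
        if cls.get("extends"):
            class_attrs["extends"] = cls["extends"]
        class_el = ET.SubElement(root, "class", class_attrs)
        for attr_name, java_type in cls.get("attributes", []):
            ET.SubElement(
                class_el, "attribute", {"name": attr_name, "type": java_type}
            )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical addition: give Gene one extra text attribute.
    print(
        build_additions(
            [
                {
                    "name": "Gene",
                    "attributes": [("knockoutPhenotype", "java.lang.String")],
                }
            ]
        )
    )
```

Writing the file by hand is just as reasonable. Generating it only pays off when the additions are derived from something else, such as a spreadsheet of fields.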

genome-analysis • 3.4k views

updated 10 months ago by Ram 43k • written 11.9 years ago by Jelena Aleksic ▴ 920