Question: Has anyone heard the term data warehousing in genomics before?
16 months ago, sapuizait wrote:

There is a job opening that I want to apply for, and among other skills they mention "data warehousing knowledge". The job description says the job involves gathering and managing cancer-related genomics data from hospitals, but I am not sure what they mean by the term "data warehouse".

A quick internet search suggests that it has more to do with companies and corporate decisions, and I have never seen the term before in biology research. Another site says that a data warehouse is superior to a normal database because it is a collection of databases. I am not sure what to make of all this. I have been using, downloading and editing online databases extensively for many years, through their web interfaces or the terminal, and I am very familiar with them. But this "data warehouse" term is confusing.

Could it be that it is just a fancier term, invented by some corporate ....... people, for a next-generation online database, and that it does not require any particular expertise? (In which case I should apply for the job.)

Thanks

data warehouse • 357 views
modified 16 months ago • written 16 months ago by sapuizait

It would be more relevant to the problem at hand if "data warehousing in genomics" were used in the title.

modified 16 months ago • written 16 months ago by Jeffin Rockey
16 months ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

It's a common term and approach in dealing with biological data. See, for example, the paper "A review of genomic data warehousing systems".

written 16 months ago by Jean-Karim Heriche

OK, I see. I also watched some YouTube videos about BioMart. So it is not so different from other sites that are "normal" databases. Is the only difference the larger amount of data and the multiple databases?

written 16 months ago by sapuizait

Data warehousing doesn't have a clear definition and takes a few different forms, but essentially it boils down to having all relevant data under the same "roof". This doesn't necessarily mean that all data have to sit in the same database; the idea is to organize the data into one coherent system. Although data warehousing may involve copying or duplicating data from different sources, the advantage is better control over data access, integration and processing.
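
A minimal sketch of the "same roof" idea, assuming two hypothetical source extracts (clinical records and variant calls) that share a patient identifier. The table names, column names and example rows are invented for illustration; a real warehouse would load from separate hospital or laboratory systems and would need far more cleaning and access control than shown here.

```python
import sqlite3

# Hypothetical extracts from two separate source systems (illustrative data only).
clinical_source = [
    ("P001", "breast", 54),
    ("P002", "lung", 61),
]
variant_source = [
    ("P001", "BRCA1", "c.68_69del"),
    ("P002", "EGFR", "p.L858R"),
    ("P002", "TP53", "p.R175H"),
]


def build_warehouse(clinical_rows, variant_rows):
    """Copy both extracts into one database keyed on a shared patient_id."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE patient ("
        "patient_id TEXT PRIMARY KEY, cancer_type TEXT, age INTEGER)"
    )
    wh.execute(
        "CREATE TABLE variant ("
        "patient_id TEXT REFERENCES patient(patient_id), gene TEXT, hgvs TEXT)"
    )
    wh.executemany("INSERT INTO patient VALUES (?, ?, ?)", clinical_rows)
    wh.executemany("INSERT INTO variant VALUES (?, ?, ?)", variant_rows)
    wh.commit()
    return wh


def genes_by_cancer_type(wh, cancer_type):
    """Answer a question that needs both sources: mutated genes per cancer type."""
    rows = wh.execute(
        "SELECT v.gene FROM variant v JOIN patient p USING (patient_id) "
        "WHERE p.cancer_type = ? ORDER BY v.gene",
        (cancer_type,),
    )
    return [gene for (gene,) in rows]


warehouse = build_warehouse(clinical_source, variant_source)
lung_genes = genes_by_cancer_type(warehouse, "lung")
print(lung_genes)
```

The point of the sketch is the last query: once both sources are copied into one coherent system, a question that spans them becomes a single join rather than two separate downloads matched by hand.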

written 16 months ago by Jean-Karim Heriche

To add to what Jean said above:

In an information technology (IT) environment, data warehousing is also relevant in the context of reporting and business intelligence. For example, in an airline's IT setup, flight and travel records from last year or earlier need not be held in the production databases. If they were, the accumulated data would make each table (with an RDBMS backend) too big, which would drastically increase query time and cost.

But those entries are not simply deleted to shed weight, because the data can provide valuable business information on travel patterns, fare responses, etc.

So in practice, current and recent data are held in the main production DB, and older records are systematically moved to a separate database setup, which may or may not be in the same geography or facility. That setup has good querying and reporting capabilities but very limited update/modify usage.
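
The archiving pattern described above can be sketched roughly as follows, assuming a single hypothetical `flight` table with ISO-formatted dates and using SQLite's `ATTACH DATABASE` to stand in for a separate archive database. The schema, cutoff date and rows are invented for illustration; a real setup would typically involve separate servers and a scheduled job.

```python
import sqlite3


def setup():
    """Create a production database and attach a separate archive database."""
    db = sqlite3.connect(":memory:")
    # ':memory:' here creates a second, independent in-memory database.
    db.execute("ATTACH DATABASE ':memory:' AS archive")
    schema = "(id INTEGER PRIMARY KEY, flight_date TEXT, fare REAL)"
    db.execute("CREATE TABLE main.flight " + schema)
    db.execute("CREATE TABLE archive.flight " + schema)
    return db


def archive_old_records(db, cutoff_date):
    """Move rows older than cutoff_date to the archive; return how many moved."""
    with db:  # commit both statements together, or roll back on error
        db.execute(
            "INSERT INTO archive.flight "
            "SELECT * FROM main.flight WHERE flight_date < ?",
            (cutoff_date,),
        )
        cur = db.execute(
            "DELETE FROM main.flight WHERE flight_date < ?", (cutoff_date,)
        )
    return cur.rowcount


db = setup()
db.executemany(
    "INSERT INTO main.flight (flight_date, fare) VALUES (?, ?)",
    [("2022-03-01", 120.0), ("2023-11-15", 95.5), ("2024-02-10", 210.0)],
)
moved = archive_old_records(db, "2023-01-01")

# Production stays small; the archive is still available for reporting queries.
production_count = db.execute("SELECT COUNT(*) FROM main.flight").fetchone()[0]
archive_avg_fare = db.execute("SELECT AVG(fare) FROM archive.flight").fetchone()[0]
print(moved, production_count, archive_avg_fare)
```

Copying and deleting inside one transaction is a deliberate choice in this sketch: a failure halfway through should not leave a record in both places or in neither.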

modified 16 months ago • written 16 months ago by Jeffin Rockey
16 months ago, sapuizait wrote:

Thank you both, you guys have been extremely helpful!

written 16 months ago by sapuizait