Has anyone heard the term data warehousing in genomics before?
5.4 years ago
sapuizait ▴ 10

There is a job opening that I want to apply for, and among other skills they mention "data warehousing knowledge". The job description says that the job involves gathering and managing cancer-related genomics data from hospitals, but I am not sure what they mean by the term "data warehouse".

A quick internet search suggests that it has more to do with companies and corporate decisions, and I have never seen the term in biology research before. Another site says that a data warehouse is superior to a normal database because it is a collection of databases. I am not sure what to make of all this. I have been using, downloading and editing online databases extensively for many years, through their web interfaces or the terminal, and I am very familiar with them. But this data warehouse term is confusing.

Could it be that it is just a fancier term, invented by some corporate ....... people, for a next-generation online database, and that it does not require any particular expertise? (In which case I should apply for the job.)

Thanks

data warehouse • 1.2k views

The title would be more relevant to the problem at hand if it used "data warehousing in genomics".

5.4 years ago

It's a common term and approach in dealing with biological data. Check for example this paper: A review of genomic data warehousing systems


OK, I see. I also watched some videos on YouTube about BioMart. So it is not so different from other sites that are "normal" databases. Is the only difference the larger amount of data and the multiple databases?


Data warehousing doesn't have a clear definition and takes a few different forms, but essentially it boils down to having all relevant data under the same "roof". This, however, doesn't necessarily mean that all data have to sit in the same database; the idea is to organize data into one coherent system. Although data warehousing may involve copying/duplicating data from different sources, the advantage is better control over data access, integration and processing.
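To make the "same roof" idea a bit more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is an assumption for illustration: the two hospital exports, their field names, the `variant_calls` table and the sample values are all made up. It only shows the general pattern of copying records from differently shaped sources into one coherent table, with the source kept for provenance.

```python
import sqlite3

# Hypothetical exports from two hospitals; field names and values are made up.
hospital_a = [
    {"patient": "A-001", "gene": "TP53", "variant": "R175H"},
    {"patient": "A-002", "gene": "KRAS", "variant": "G12D"},
]
hospital_b = [
    {"sample_id": "B-17", "symbol": "TP53", "hgvs_p": "R248Q"},
]


def build_warehouse(conn):
    """Copy both sources into one table with a shared layout."""
    conn.execute(
        "CREATE TABLE variant_calls ("
        " source TEXT NOT NULL, sample TEXT NOT NULL,"
        " gene TEXT NOT NULL, variant TEXT NOT NULL)"
    )
    # Map each source's own field names onto the shared columns.
    rows = [("hospital_a", r["patient"], r["gene"], r["variant"]) for r in hospital_a]
    rows += [("hospital_b", r["sample_id"], r["symbol"], r["hgvs_p"]) for r in hospital_b]
    conn.executemany("INSERT INTO variant_calls VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def samples_per_gene(conn):
    """Count distinct samples per gene across all sources."""
    cur = conn.execute(
        "SELECT gene, COUNT(DISTINCT sample) FROM variant_calls"
        " GROUP BY gene ORDER BY gene"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
print(samples_per_gene(conn))  # [('KRAS', 1), ('TP53', 2)]
```

Once the data share one layout, a question like "how many samples carry a variant in each gene, across all hospitals" becomes a single query instead of one query per source.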


To add to what Jean said above:

In an information technology (IT) environment, data warehousing is also relevant in a reporting and business intelligence context. For example, in an airline IT setup, flight/travel records from last year or earlier need not be held in production databases. If they were, the accumulated data would make each table (assuming an RDBMS backend) so big that query time and cost would increase drastically.

But those entries are not deleted to shed weight, because the data can provide valuable business information on travel patterns, fare responses, etc.

So in practice, current and recent data are held in the main production DB, and older records are systematically moved to another DB setup, which may or may not be in the same geography or facility. That setup still has good querying and reporting capabilities but very limited update/modify usage.
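A rough sketch of that archiving pattern, again in Python with sqlite3. The `flights` and `flights_archive` tables, the cutoff date and the fare values are hypothetical, and a real setup would normally move rows to a separate server rather than a second table in the same database. The point is only the shape of the operation: copy old rows out, delete them from the production table, then run reporting queries against the archive.

```python
import sqlite3


def archive_old_flights(conn, cutoff):
    """Move flights that departed before `cutoff` (ISO date) to the archive table."""
    with conn:  # one transaction: the copy and the delete commit or roll back together
        conn.execute(
            "INSERT INTO flights_archive SELECT * FROM flights WHERE departed < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM flights WHERE departed < ?", (cutoff,))
    return cur.rowcount  # number of rows removed from the production table


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
schema = "(id INTEGER PRIMARY KEY, route TEXT, departed TEXT, fare REAL)"
conn.execute("CREATE TABLE flights " + schema)
conn.execute("CREATE TABLE flights_archive " + schema)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?)",
    [
        (1, "LHR-JFK", "2017-03-01", 520.0),
        (2, "LHR-JFK", "2017-08-15", 610.0),
        (3, "LHR-JFK", "2019-01-10", 580.0),
    ],
)
conn.commit()

moved = archive_old_flights(conn, "2018-01-01")
live = conn.execute("SELECT COUNT(*) FROM flights").fetchone()[0]

# Reporting query against the archive: average historical fare per route.
avg_fares = conn.execute(
    "SELECT route, AVG(fare) FROM flights_archive GROUP BY route"
).fetchall()
print(moved, live, avg_fares)  # 2 1 [('LHR-JFK', 565.0)]
```

The production table stays small and fast for day-to-day updates, while the archive is written rarely and read mostly through aggregate queries like the last one.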

5.4 years ago
sapuizait ▴ 10

Thank you both, you guys have been extremely helpful!

3.1 years ago
ahatest007 • 0

Try the following; it describes the basics of data warehousing.

Essentially, a data warehouse is just a database that has been set up for reporting and analysing data. The data is organised so that queries across large datasets run quickly. Whilst a transactional database is all about getting data into the system as quickly as possible and accessing the latest data, a data warehouse looks at historical data over years and decades to help better understand trends.
