Question: Is Bioinformatics Data Indeed More Open Than Cheminformatics Data?
Egon Willighagen (4.6k), Maastricht, wrote 3.2 years ago:

A running meme is that bioinformatics data is more open than cheminformatics data (e.g. mentioned in this blog). It is my feeling too that this is the case, but looking at this spreadsheet with NAR-listed databases (resulting from this BioStar question) already shows many instances where the data cannot be downloaded. Moreover, the spreadsheet gives no indication of whether I can modify and redistribute the data, two core rights for Open Data. I therefore added two columns to allow annotation of these two aspects. Any non-commercial clause would make it non-open data too. General info can be found at Is It Open Data?

So, my question is basically: how Open is bioinformatics data? What percentage of the data is in fact Open?
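Once the spreadsheet is annotated, the percentage asked for above could be tallied along these lines. This is only a minimal sketch: the field names (`downloadable`, `modifiable`, `redistributable`, `non_commercial_only`) and the placeholder entries are assumptions made for illustration, not the actual spreadsheet columns or real databases. An entry counts as Open only if it can be downloaded, modified, and redistributed and carries no non-commercial clause; unannotated (`None`) fields are conservatively treated as not Open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatabaseEntry:
    # Hypothetical annotation fields; None means "not yet annotated".
    name: str
    downloadable: Optional[bool]
    modifiable: Optional[bool]
    redistributable: Optional[bool]
    non_commercial_only: Optional[bool]


def is_open(entry: DatabaseEntry) -> bool:
    """Open only if all three rights are confirmed and there is
    confirmed absence of a non-commercial clause."""
    return (
        entry.downloadable is True
        and entry.modifiable is True
        and entry.redistributable is True
        and entry.non_commercial_only is False
    )


def percent_open(entries: List[DatabaseEntry]) -> float:
    """Percentage of entries that qualify as Open (0.0 for an empty list)."""
    if not entries:
        return 0.0
    return 100.0 * sum(is_open(e) for e in entries) / len(entries)


# Placeholder rows, NOT real databases or real licensing facts.
sample = [
    DatabaseEntry("placeholder_a", True, True, True, False),   # Open
    DatabaseEntry("placeholder_b", True, True, True, True),    # non-commercial clause
    DatabaseEntry("placeholder_c", False, None, None, None),   # not downloadable
    DatabaseEntry("placeholder_d", True, None, None, None),    # rights unknown
]

result = percent_open(sample)
```

Treating unknown rights as not Open gives a lower bound; counting them separately would show how much of the gap is simply missing annotation.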

modified 3.1 years ago by Andra Waagmeester (3.0k) • written 3.2 years ago by Egon Willighagen (4.6k)

Seeing as you can never quantify how much information is closed, entombed, or siloed away, I'd say this is almost impossible to answer.

written 3.2 years ago by Daniel Swan (10k)

I guess this NAR database spreadsheet would be a reasonable approximation, no?

written 3.2 years ago by Egon Willighagen (4.6k)

Daniel is right in principle. The practical comparison we can make is between available but non-open data and real open data, which indeed is what the table is about.

written 3.1 years ago by Chris Evelo (8.8k)

I am not sure about your evaluation of a non-commercial clause. Larger databases are expensive structures that are often paid for by community-funded research projects. They need to be maintained after those projects end, which is still expensive. It makes sense that you pay for the maintenance if you make a profit from using the data. Also, it is just not fair to take free open data, wrap it up in a nicely colored website or tool, and sell it. As long as these clauses allow fair usage, I think that is fine.

written 3.1 years ago by Chris Evelo (8.8k)

Chris, I understand your arguments (and have heard them many times), but isn't the whole idea of making data freely available that people in fact use it? How does a fancy website make maintenance more difficult or more expensive for you, if others help you share it? What defines a profit? Profit is one of the virtues of western civilization; what's wrong with that? How does it hurt the community if the data becomes more accessible because others start distributing it? I don't understand your point... (If you really just worry about attribution, given you mention 'fair', that's a whole other clause.)

written 3.1 years ago by Egon Willighagen (4.6k)
Daniel Standage (3.0k), Bloomington, Indiana, USA, wrote 3.2 years ago:

I work mostly with genomics data (plant genomics to be precise), and I have never had any problems accessing, using, or reusing genomics data. I guess an exception would be when I have been given access to data that is not yet published, but that is quite understandable. In each case, when the data was published, everything I had and more became accessible to me and the general public.

Maybe it's just the nature of the beast: as an academic, you can't spend that much money and effort sequencing, assembling, and annotating a genome unless you plan to make it available as a public information resource.

written 3.2 years ago by Daniel Standage (3.0k)

@Egon I think I have to agree with you that (prote|metabol)omics data is much more scarce. Perhaps this scarcity is more of an issue than data licensing?

written 3.2 years ago by Daniel Standage (3.0k)

Yes, genomics data is OK... but how many proteomics and metabolomics databases are freely available? The latter is my field, really, and data are pretty scarce...

written 3.2 years ago by Egon Willighagen (4.6k)

Well, there is plenty of metabolomics data around, just not freely available... dunno so much how that applies to proteomics...

written 3.1 years ago by Egon Willighagen (4.6k)
Chupvl (420), Toledo, Spain, wrote 3.2 years ago:

I think that bioinformatics is more open because you cannot sell this information, whereas chemoinformatics data is more expensive.

written 3.2 years ago by Chupvl (420)

You cannot sell biomarkers? The thing is indeed that chemical structures can be patented... I think we will see biomarkers patented, if that is not already happening... they patent genes too...

written 3.2 years ago by Egon Willighagen (4.6k)

I just want to say that chemoinformatics data is more valuable than bioinformatics data. Yes, we still cannot deal with the huge amounts of biodata available, and we still need real bioinformatics tools.

Gene and biomarker patenting is a big issue. Gene patents have had a vague legal status in Europe and the US: "It argues that isolated and altered DNA should be patentable, whereas DNA that is simply isolated should not be patentable."

Talking about biomarkers: yes, you can sell them, but in most cases this info is really odd. And in my opinion a biomarker patent will also have no legal status.

written 2.9 years ago by Chupvl (420)
Andra Waagmeester (3.0k), Maastricht, the Netherlands, wrote 3.2 years ago:

I am not sure if this is a fair comparison. Cheminformatics is a much younger field than bioinformatics. Having said that, resources like the Human Metabolome Database (http://www.hmdb.ca), ChEBI (http://www.ebi.ac.uk/chebi/), and KEGG Compound (http://www.genome.jp/kegg/compound/) are open resources.

modified 3.2 years ago • written 3.2 years ago by Andra Waagmeester (3.0k)

Actually, no. See http://blog.rguha.net/?p=913 - pretty similar lineages

written 3.1 years ago by Rajarshi Guha (750)