A running meme is that bioinformatics data is more open than cheminformatics data (e.g. mentioned in this blog). That is my feeling too, but a look at this spreadsheet with NAR-listed databases (resulting from this BioStar question) already shows many instances where the data cannot be downloaded. Moreover, the sheet gives no clue as to whether I can modify and redistribute the data, two core rights for Open Data. I therefore added two columns to allow annotation of these two aspects. Any non-commercial clause would make it non-open data too. General info can be found at Is It Open Data?
So, my question is basically how Open is bioinformatics data? What is the percentage of data that is in fact Open?
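Once the columns are annotated, the percentage could be tallied with something like the sketch below. It only encodes the criteria mentioned above (downloadable, modifiable, redistributable, no non-commercial clause). The field names and the example rows are made up for illustration; they are not the actual spreadsheet columns or real databases.

```python
from dataclasses import dataclass


@dataclass
class DatabaseEntry:
    # Hypothetical annotation fields, one row per database in the sheet.
    name: str
    downloadable: bool
    modifiable: bool
    redistributable: bool
    non_commercial_only: bool


def is_open(entry: DatabaseEntry) -> bool:
    """Open only if all rights are granted and there is no non-commercial clause."""
    return (
        entry.downloadable
        and entry.modifiable
        and entry.redistributable
        and not entry.non_commercial_only
    )


def percent_open(entries: list[DatabaseEntry]) -> float:
    """Percentage of entries that count as Open; 0.0 for an empty list."""
    if not entries:
        return 0.0
    return 100.0 * sum(is_open(e) for e in entries) / len(entries)


# Placeholder rows, not real databases.
entries = [
    DatabaseEntry("ExampleDB-A", True, True, True, False),
    DatabaseEntry("ExampleDB-B", True, True, True, True),    # non-commercial clause
    DatabaseEntry("ExampleDB-C", False, False, False, False),  # no download at all
    DatabaseEntry("ExampleDB-D", True, False, True, False),  # no right to modify
]

print(f"{percent_open(entries):.1f}% open")
```

Rows where a right is simply unknown would need a third state (e.g. `None`); this sketch treats every annotation as a plain yes/no.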
I work mostly with genomics data (plant genomics, to be precise), and I have never had any problems accessing, using, or reusing genomics data. I guess an exception would be when I have been given access to data that is not yet published, but that is quite understandable. In each case, once the data was published, everything I had and more became accessible to me and the general public.
Maybe it's just the nature of the beast: as an academic, you can't spend that much money and effort sequencing, assembling, and annotating a genome unless you plan to make it available as a public information resource.
I am not sure if this is a fair comparison. Cheminformatics is a much younger field than bioinformatics. Having said that, resources like the Human Metabolome Database (http://www.hmdb.ca), ChEBI (http://www.ebi.ac.uk/chebi/), and KEGG COMPOUND (http://www.genome.jp/kegg/compound/) are open resources.