I'm currently a postdoc with a solid background in bioinformatics, and I've gotten by for the last few years by storing everything I use in flat files. So far it hasn't been a problem, but I can't help feeling like I should know more about databases so that I can use them where appropriate.
Sure, I've used databases. I can write basic SQL statements (select * from blah) and insert or delete rows. I've used this to design uber-simple web apps in the past. Where I'm deficient is in understanding how to design databases well. What columns should be indexed? What's all this stuff about normalization? Why would I prefer one database type over another (MySQL, SQLite, MongoDB, etc.)? Which are most commonly used for bioinformatics work, and in what subdomains (genomics, proteomics, MD, etc.)?
I'd like the community's help in finding some good tutorials that will get an experienced coder and bioinformatician up to speed in the minimum amount of time possible. Bonus points if they specifically address the types of big-data problems that we're all facing in the new high-throughput world of bioinformatics.
Your question would be a lot easier to answer (and to research) if you reduced its scope. The expression "biological databases" is too generic: one would most certainly use different techniques for storing intervals than for storing reads, microarray data, etc.
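To make the interval case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name (feature), the column names, the half-open coordinate convention, and the sample rows are all made up for illustration; this is one simple way to do it, not a recommendation for large datasets.

```python
import sqlite3


def build_interval_db(rows):
    """Load (chrom, start, end, name) tuples into an in-memory SQLite table.

    The schema is hypothetical; coordinates are assumed to be half-open
    [start_pos, end_pos).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature ("
        "chrom TEXT NOT NULL, start_pos INTEGER NOT NULL, "
        "end_pos INTEGER NOT NULL, name TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?)", rows)
    # Index the columns that appear in the WHERE clause of the common query.
    conn.execute("CREATE INDEX idx_feature_pos ON feature (chrom, start_pos)")
    return conn


def overlapping(conn, chrom, start, end):
    """Return names of features overlapping the half-open range [start, end)."""
    cur = conn.execute(
        "SELECT name FROM feature "
        "WHERE chrom = ? AND start_pos < ? AND end_pos > ? "
        "ORDER BY start_pos",
        (chrom, end, start),
    )
    return [row[0] for row in cur]


rows = [
    ("chr1", 100, 200, "geneA"),
    ("chr1", 150, 300, "geneB"),
    ("chr1", 400, 500, "geneC"),
    ("chr2", 100, 200, "geneD"),
]
conn = build_interval_db(rows)
print(overlapping(conn, "chr1", 180, 250))  # ['geneA', 'geneB']
```

A plain index like this helps with the chrom and start_pos part of the condition only, which is part of why interval data is often handled with specialised schemes rather than a generic table.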
The most important question is what problems you want to solve. Learning purely for the sake of learning will not teach you much. Are you going to design a database aimed at the NAR Database Issue? If not, do not waste your time on SQL.