I'm currently a postdoc with a solid background in bioinformatics, and I've gotten by for the last few years by storing everything I use in flat files. So far it hasn't been a problem, but I can't help feeling like I should know more about databases so that I can use them where appropriate.
Sure, I've used databases. I can write basic SQL statements (
select * from blah) or insert and delete rows. I've used this to design uber-simple web apps in the past. Where I'm deficient is understanding how to design them well. What columns should be indexed? What's all this stuff about normalization? Why would I prefer one database type over another? (MySQL, SQLlite, MongoDB, etc). Which are most commonly used for bioinformatics work, and in what subdomains (genomics, proteomics, MD, etc)?
I'd like the community's help in finding some good tutorials that will get an experienced coder and bioinformatician up to speed in the minimum amount of time possible. Bonus points if they specifically address the types of big-data problems that we're all facing in the new high-throughput world of bioinformatics.