Forum: Which database software is better for bioinformatics web development? MySQL or MongoDB?
gravatar for mcclintock
11 months ago by
mcclintock10 wrote:

Once I did a bioinformatics project using MongoDB, but I'm kind of confused which database software is more suitable for bioinformatics web development. Sure, MySQL, relational database, is great. However, what's the difference between the performance of these two kinds of database, relational and non-relational? PS: Does it depends on whether we use PHP or not?

ADD COMMENTlink modified 11 months ago by Eric Lim1.4k • written 11 months ago by mcclintock10
gravatar for Michael Dondrup
11 months ago by
Bergen, Norway
Michael Dondrup46k wrote:

This is a very general question, in particular what is bioinformatics web development? It really depends on the application case and requirements, and certainly the type of data to store. Note: Database Management Systems (DBMS) are a back-end technology meaning they are used for storing and retrieving the data in a reliable and performant way. web development refers to front-end technology, that is the presentation of the application to the user and how the application is interacting with the user. Deciding on the technology should be the last step in the design process of a software.

Some questions to guide the decision process:

  • How is your data structured, can the data be represented in relational format?
  • Have you modeled the data using UML/ER diagrams?
  • Volume of your data?
  • Do you update/read often or rarely?
  • Do you update/read in parallel?
  • Do you need transactions, atomic actions, ACID, to ensure consistency?
  • Do you need foreign keys, indexing, stored procedures, views?
  • Distributed?
  • Other performance boosting required?
  • Replication required?
  • Do you need a well-defined Query language (SQL, XPATH) or can you hardcode a few queries?
  • can you setup a DB server in client server infrastructure?
  • do you need remote connections (CLI, TCP, web-services) to the database?
  • which authentication/authorization model do you use for your application?
  • which backup strategy do you prefer/policies are in place at the site of deployment?
  • Do language bindings to preferred frame-work/coding tools/languages exist at the appropriate level of abstraction?
  • Which support model do you prefer? (Some might prefer a commercial support model)


  • PostgreSQL (most feature-rich, covers practically everything named above, richest SQL, including Document database/NoSQL functions, will work most of the time)
  • MySQL, MariaDB, etc. (very similar in feature richness, maybe a bit higher performance)
  • SQLite (file based SQL database management, no server required, great for testing, should fit most bioinformatics needs, unless many concurrent updates, easy to share as a file - no dump required, easiest way to start and to learn SQL)
  • Oracle (if you need a commercial DBMS)
  • Large variable matrices (microarrays, RNA-seq data, other measurements) or array-shaped data with variable dimensions or loosely defined structure are not such a good fit for Relational Databases, HDF5 or similar formats could be a reasonable alternative, possibly keeping metadata in MongoDB.
ADD COMMENTlink modified 11 months ago • written 11 months ago by Michael Dondrup46k

Great idea! There are so many things I never learned. Thanks, it really help me a lot.

ADD REPLYlink written 11 months ago by mcclintock10

in light of the 'bioinformatics' side of the question, I think it also becomes important to consider where you plan to run the app from. A lot of bioinformatics work takes place on shared systems like HPC's, which have restricted user-rights and software access; in these cases, I end up going with SQLite a lot because its often the easiest to get running there.

ADD REPLYlink written 11 months ago by steve2.2k
gravatar for Raony Guimarães
11 months ago by
Dublin / Ireland
Raony Guimarães1000 wrote:

You could use PostgreSQL and JsonB to get the best of both worlds: relational and non-relational features.

ADD COMMENTlink written 11 months ago by Raony Guimarães1000

Thank you very much! I will give it a try.

ADD REPLYlink written 11 months ago by mcclintock10
gravatar for Eric Lim
11 months ago by
Eric Lim1.4k
Eric Lim1.4k wrote:

Micheal has given and asked some really excellent questions, so I'll just share what we've been doing.

Our approach is developing and deploying events driven web applications using serverless technology ( ) and couple these applications with various serverless database services (,, and ). The idea behind this approach is to reduce the group's burden in maintaining running servers so we can focus on the development and analytics.

ADD COMMENTlink modified 11 months ago by genomax70k • written 11 months ago by Eric Lim1.4k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1153 users visited in the last hour