Open-source web server for multi-omics data storage and descriptive visualization
4 months ago
Zhilong Jia ★ 2.1k

Is there an open-source web server (or framework) for managing multi-omics data and sample metadata, with descriptive visualisation?

Such a web server would store raw multi-omics data (genomics, transcriptomics, proteomics, metabolomics, microbiome, epigenome) and key omics files, or the paths to them, such as VCF files and expression matrices. Descriptive visualisation of the matrix data would be a bonus. Thank you.

webserver multi-omics open-source • 372 views
4 months ago

Gen3 is probably the closest to what you were thinking of. Self-hosting is possible, but you will probably need full-time engineer(s) in your organization to set up and maintain the system.

As soon as the amount of your data reaches a level where the benefits of having such a system outweigh the effort of setting one up, customizing and maintaining it unfortunately warrants full-time engineers. Here, a LinkedIn engineer nicely elaborates on the challenges of building and maintaining such a system. Many large companies have worked on internal tooling to make datasets discoverable across the whole organization by gathering all metadata into a central data catalogue, and basically all of them ended up building their own custom systems to meet their demands. Thus, there is no shortage of open-source systems you could customize to manage your metadata, but none will work out of the box, and all require substantial work on your side:

There are also some other efforts to build data platforms with a biology/genomics focus, but as far as I know the Elixir Data Catalogue, the European Genomic Data Infrastructure (GDI) and the German Human Genome-Phenome Archive are all work in progress.

For raw data storage, you could also take a look at Hail, but maybe a simple object store with a mantis index is already sufficient for your needs. If you need to serve bioinformatic file formats over a network, various implementations of the htsget protocol (e.g. in Rust) are available. For data versioning, Restic, or its reimplementation Rustic, could be a relatively straightforward solution that also works without much overhead at the level of single workgroups.
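To give a rough idea of what talking to an htsget server looks like, here is a minimal Python sketch that builds a ticket request URL and reads the data-block URLs out of the JSON ticket the server returns. The host `htsget.example.org`, the record ID `sample1` and the helper names are made up for illustration; only the URL layout (`/reads/<id>` or `/variants/<id>`), the query parameters (`format`, `referenceName`, `start`, `end`) and the `htsget` / `urls` ticket fields come from the htsget specification. No network access is performed here.

```python
import json
from urllib.parse import quote, urlencode


def build_htsget_url(base_url, record_id, endpoint="reads", fmt="BAM",
                     reference_name=None, start=None, end=None):
    """Build an htsget ticket request URL (hypothetical helper, not a library API)."""
    if endpoint not in ("reads", "variants"):
        raise ValueError("endpoint must be 'reads' or 'variants'")
    params = {"format": fmt}
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start  # 0-based, inclusive
    if end is not None:
        params["end"] = end      # 0-based, exclusive
    path = f"{base_url.rstrip('/')}/{endpoint}/{quote(record_id, safe='')}"
    return f"{path}?{urlencode(params)}"


def ticket_urls(ticket_json):
    """Return (url, headers) pairs from an htsget JSON ticket, in order."""
    ticket = json.loads(ticket_json)["htsget"]
    return [(block["url"], block.get("headers", {})) for block in ticket["urls"]]


# Example with a made-up server and a hand-written ticket.
url = build_htsget_url("https://htsget.example.org", "sample1",
                       reference_name="chr1", start=0, end=1000)
example_ticket = json.dumps({
    "htsget": {
        "format": "BAM",
        "urls": [
            {"url": "https://data.example.org/sample1.bam",
             "headers": {"Range": "bytes=0-65535"}},
        ],
    }
})
blocks = ticket_urls(example_ticket)
```

A real client would then fetch every returned URL with its headers and concatenate the payloads in order to obtain a valid BAM or VCF slice.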

