News:Reproducible Bioinformatics Project
6.6 years ago

Reproducible Bioinformatics Project

Reproducible research is a key component of the scientific method: it is the ability to repeat an experiment in any place, by any person.

A study can be truly reproducible when it satisfies at least the following three criteria.

  • All methods are fully reported.
  • All data and files used for the analysis are (publicly) available.
  • The process of analyzing raw data is well reported and preserved.

(Extracted from https://www.r-bloggers.com/what-is-reproducible-research/)

The above points should also apply to Bioinformatics. Today, however, having access to the raw data and to the process used for data analysis does not guarantee that a bioinformatics analysis can be reproduced. Results may fail to reproduce because the analytical process is unclearly explained or because system libraries differ, which can lead to subtle reproducibility issues.

To address the above points we have set up the Reproducible Bioinformatics Project, a non-profit, open-source project that aims to provide reproducible results in Bioinformatics using Docker images. Specifically, the project is based on the creation of easy-to-use Bioinformatics workflows that fulfill the following rules (Sandve et al. PLoS Comp Biol. 2013):

  1. For Every Result, Keep Track of How It Was Produced
  2. Avoid Manual Data Manipulation Steps
  3. Archive the Exact Versions of All External Programs Used
  4. Version Control All Custom Scripts
  5. Record All Intermediate Results, When Possible in Standardized Formats
  6. For Analyses That Include Randomness, Note Underlying Random Seeds
  7. Always Store Raw Data behind Plots
  8. Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
  9. Connect Textual Statements to Underlying Results
  10. Provide Public Access to Scripts, Runs, and Results
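As an illustration of rules 1, 3 and 6, here is a minimal sketch of a provenance record saved alongside a result. It is not part of the project's workflows: the function names, the tool name "aligner" and its version string are assumptions made up for this example.

```python
import hashlib
import json
import platform
import random


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of some input content."""
    return hashlib.sha256(data).hexdigest()


def make_provenance(inputs: dict, tool_versions: dict, seed: int) -> dict:
    """Collect what is needed to trace how a result was produced.

    inputs maps a file name to its raw bytes; tool_versions maps a
    tool name to the exact version string used (rule 3); seed is the
    random seed of the analysis (rule 6).
    """
    return {
        "python": platform.python_version(),
        "tools": dict(sorted(tool_versions.items())),
        "seed": seed,
        "inputs": {
            name: sha256_of_bytes(content)
            for name, content in sorted(inputs.items())
        },
    }


def subsample(items: list, k: int, seed: int) -> list:
    """Draw k items with a dedicated, seeded generator so the draw can be repeated."""
    rng = random.Random(seed)
    return rng.sample(items, k)


record = make_provenance(
    inputs={"reads.fastq": b"@r1\nACGT\n+\nIIII\n"},  # toy content, not real data
    tool_versions={"aligner": "1.2.3"},  # hypothetical tool and version
    seed=42,
)
print(json.dumps(record, indent=2))
```

Writing such a record to a JSON file next to each output is one simple way to keep track of how a result was produced. Calling subsample twice with the same seed returns the same selection.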

Today three workflows are available:

  • RNAseq workflow
  • miRNAseq workflow
  • ChIPseq workflow

Three more are under development:

  • PDX workflow: variant calling in patient-derived xenografts (PDX) from RNAseq and EXOMEseq data
  • Single cell analysis workflow
  • Metagenomics workflow

We are looking for Bioinformaticians interested in joining the Reproducible Bioinformatics Community. Those who wish to embed specific applications in the available workflows, or to develop a new workflow, are asked to embed the application(s) in a Docker image, save it in a public repository, and configure one or more R functions that can be used to interact with the Docker image.
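To give an idea of what a wrapper around a Docker image does, here is a minimal sketch in Python (the project itself asks for R functions). It only builds the `docker run` command line and does not execute it. The image name, script name and paths are hypothetical placeholders, not actual project images.

```python
import shlex


def docker_run_argv(image: str, host_dir: str, command: list,
                    container_dir: str = "/data") -> list:
    """Build the argument list for running a tool inside a container.

    --rm removes the container when it exits; -v mounts host_dir at
    container_dir so the tool can read inputs and write outputs there.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{container_dir}",
        image,
        *command,
    ]


argv = docker_run_argv(
    image="example/rnaseq-tool:1.0.0",  # hypothetical image; pin an exact tag
    host_dir="/home/user/project",      # hypothetical host folder
    command=["run_analysis.sh", "/data/reads.fastq"],  # hypothetical script
)
print(shlex.join(argv))
```

On a machine with Docker installed, the list could be passed to `subprocess.run(argv, check=True)`. Pinning an exact image tag, rather than `latest`, is what keeps the software environment identical between runs.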

More information on the project and on how to take part is available at http://www.reproducible-bioinformatics.org/, or contact raffaele.calogero@unito.it

genome ChIP-Seq next-gen R rna-seq