Question

Forum: What are the 5 biggest challenges/opportunities in Bioinformatics going into 2021?

1

4.5 years ago

Parth Patel ▴ 50

I would love to hear folks' candid thoughts on the biggest challenges/opportunities in the field of bioinformatics. I plan on using this as a starting point for some research, so any books, articles, videos, etc. would be much appreciated :)

research challenges • 5.1k views

ADD COMMENT • link updated 2.3 years ago by Ram 45k • written 4.5 years ago by Parth Patel ▴ 50

3

Define bioinformatics.

ADD REPLY • link 4.5 years ago by Jean-Karim Heriche 27k

1

Ranking open problems by 'biggest challenge' is a tough one for me. Some centers have service problems, such as how to get WGS results or detect cancer somatic variants ASAP. Those don't rank against scientific advancements like 3D chromatin orientation or protein docking.

ADD REPLY • link 4.5 years ago by karl.stamm 4.1k

score 3 · Answer 1 · 2021-01-03

In no particular order:

Standardisation of methods used in clinical practice (may very well be region and country-specific)
Certification of who can call themselves a bioinformatician
Data curation
Increasing compute capacity (we are already reaching limits with large single-cell datasets)
Training of new bioinformaticians

Kevin

score 3 · Answer 2 · 2021-01-03

3

4.5 years ago

GenoMax 152k

Creating approved tools that adhere to standards mandated by regulatory agencies (e.g. FDA) so that they can be used by regular users (think physicians). These tools need to produce results/reports that make sense to their respective users.
Creating workflow/pipeline tools that can be used/understood by people who are not programmers
Making cloud computing user accessible.

Edit: I should note that the problems described by @Ian are in the domain of computational biologists/statisticians. My list is from the perspective of an applied bioinformatician.

ADD COMMENT • link 4.5 years ago by GenoMax 152k

0

Ah, the old computational biology vs bioinformatics debate. Yes, I guess I agree the problems I highlighted are computational biology rather than bioinformatics.

ADD REPLY • link 4.5 years ago by i.sudbery 21k

score 2 · Answer 3 · 2021-01-03

As was alluded to above, it's difficult to say what the biggest open problems in bioinformatics are because of the position that bioinformatics occupies as an enabler of other things. Thus many of the big open problems in bioinformatics are about infrastructure and don't require the skills we normally think of as bioinformatics skills (computer science, statistics, biological knowledge); they are actually informatics problems and social problems (see @Kevin Blighe's and @GenoMax's answers). These are actually proper bio-informatics problems, but they are not the sort of problem that many people coming into bioinformatics want to solve (perhaps why they are still unsolved).

The other categories of problem are not bioinformatics problems, but biology problems that need bioinformatic solutions.

I don't know about the most important, but some things I'd like to see tackled in 2021, from my perspective as someone interested in transcriptomics and gene-regulation:

Proper statistical models, with theoretical as well as empirical justification, for cross-technique comparison (e.g. comparing whole-transcript scRNA-seq to UMI-tagged scRNA-seq, or either of those to bulk RNA-seq, but in general any two datasets generated from negative binomial processes with unknown systematic and random biases).
In a similar vein: routine extraction of biological parameters from single-cell data beyond just cell-type identity/differentiation state/lineage. E.g. I'd love to see algorithms that use measurements of differential variability in single-cell data to infer the structure and mechanisms of the regulation happening.
A perennial favorite that I don't think is yet fully solved: identification of functionally relevant non-coding mutations (in both non-transcribed sequence and transcribed but non-coding sequence). An under-explored avenue that I see here is the use of the large human variation datasets (e.g. gnomAD) to explore within-species constraint in non-coding space.
Joint estimation of expression, genotype and genotype:expression interactions (allelic imbalance) from RNA-seq data, including the use of replicates (both within and between individuals).
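
To make the allelic imbalance part of the last item a bit more concrete, here is a rough Python sketch of the naive per-site baseline that a joint model would need to improve on: an exact two-sided binomial test of reference vs alternative read counts at a single heterozygous site. The function names and the read counts are made up for illustration. The sketch assumes the genotype is already known, that reads are independent, and that there is no reference mapping bias (null proportion of 0.5), none of which really holds for RNA-seq data, and it ignores replicates entirely.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space.

    Assumes 0 < p < 1.
    """
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def allelic_imbalance_pvalue(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial test for allelic imbalance at one het site.

    Hypothetical helper: sums the probability of every outcome that is no
    more likely than the observed one under the null proportion p_null.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence either way
    observed = binom_pmf(ref_reads, n, p_null)
    # Small relative tolerance so ties are not lost to floating-point noise.
    threshold = observed * (1 + 1e-9)
    total = 0.0
    for k in range(n + 1):
        pmf = binom_pmf(k, n, p_null)
        if pmf <= threshold:
            total += pmf
    return min(1.0, total)


if __name__ == "__main__":
    # Illustrative (made-up) read counts: balanced, mildly skewed, fully skewed.
    for ref, alt in [(10, 10), (11, 9), (20, 0)]:
        print(ref, alt, allelic_imbalance_pvalue(ref, alt))
```

A plain binomial like this is usually too confident on real data because allele counts are overdispersed, which is part of why joint estimation with replicates is the interesting problem.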