What are some areas where machine learning is applicable in biology and cancer?
4.3 years ago

I've only heard of ML being used in diagnostics through image classification, but what are some other topics related to ML? Anything related to cancer? Research papers would be helpful too.

software deep learning • 1.7k views

If you want papers, simply search PubMed for "machine learning AND cancer"; there is a lot of material available.

Hello isaacbruth1234!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/11055/what-are-some-areas-where-machine-learning-is-applicable-in-biology-and-cancer

This is typically not recommended as it runs the risk of annoying people in both communities.

3.0 years ago
krysgourlia ▴ 30

Broadly, you can apply machine learning in cancer biology in two ways:

  1. To classify samples (e.g. cancer sub-types, healthy vs cancer patients, or by sex) based on transcriptomics, proteomics and metabolomics profiles
  2. To predict patient survival via regression analysis from therapy response, biometric data, sex, ethnicity, etc.
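For the survival side of point 2, analyses often start from a Kaplan-Meier curve before any regression model is fitted. Here is a minimal sketch of that estimator (a descriptive estimate, not a regression model), in plain Python. The follow-up times and event flags are made up, and the function name `kaplan_meier` is just for this example.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time per patient (hypothetical units, e.g. months)
    events: 1 = death observed at that time, 0 = censored at that time
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths + censored at t
        if deaths:
            # Multiply by the fraction of at-risk patients surviving past t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Toy cohort of six patients (made-up data).
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients (event flag 0) still count as "at risk" up to their last follow-up, which is the whole point of the estimator compared with simply averaging survival times.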

Generally, via machine learning you can predict which ctRNAs, microRNAs, genes, proteins, or characteristics such as age and sex are involved in cancer development, therapy response, etc. For biologists and medical doctors who don't write code or have programming experience, it is easier to use AutoML services, where you upload your data and get results quickly. Personally, as a cancer biologist, I use JADBio, an AutoML platform that can analyse biomedical data with small sample sizes or very large feature sets. It stood out from other AutoML tools for its friendly user interface and for the comments it provides in the results analysis, which help non-experts evaluate their results. The main benefit is that it lets life-science professionals build and deploy accurate and explainable predictive models quickly, even if they have no data science expertise.

Hope this helps!

4.3 years ago
dsull ★ 5.8k

Machine learning is very broad... A lot of the unsupervised machine learning methods (e.g. PCA, hierarchical clustering, t-SNE, etc.) are used all the time in microarray/RNA-seq analysis. Even basic supervised methods such as linear regression (used almost everywhere) are considered "machine learning".
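As a rough illustration of the unsupervised side, here is a minimal PCA sketch in Python with NumPy, computed via SVD of the centered matrix. The expression values are made up, and the function name `pca` is only for this example; in practice most people would reach for an existing library implementation.

```python
import numpy as np


def pca(expr, n_components=2):
    """PCA of a samples x genes matrix via SVD of the centered data."""
    centered = expr - expr.mean(axis=0)               # center each gene
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # sample coordinates
    explained = (s ** 2) / np.sum(s ** 2)             # variance fraction per PC
    return scores, explained[:n_components]


# Toy log-expression matrix: 6 samples x 4 genes (made-up values).
# Samples 0-2 and 3-5 differ mainly in the first two genes.
expr = np.array([
    [8.0, 7.9, 3.1, 5.0],
    [8.2, 8.1, 2.9, 5.1],
    [7.9, 8.0, 3.0, 4.9],
    [2.1, 2.0, 3.0, 5.0],
    [1.9, 2.2, 3.1, 5.1],
    [2.0, 1.9, 2.9, 4.9],
])
scores, explained = pca(expr)
print("PC1 coordinates:", np.round(scores[:, 0], 2))
print("variance explained:", np.round(explained, 3))
```

The two sample groups end up on opposite sides of PC1 (the sign of a principal component is arbitrary), which is the kind of separation people look for in an RNA-seq PCA plot.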

A lot of differential gene expression analysis models and RNA-seq quantification models use machine learning concepts to maximize some likelihood or minimize some loss to make predictions (e.g. predicting which transcript a sequencing read comes from via expectation-maximization). Even in 2019, people are still writing up papers implementing machine learning approaches for RNA-seq analysis: https://www.ncbi.nlm.nih.gov/pubmed/30664774
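To make the expectation-maximization idea concrete, here is a toy sketch in plain Python. It assumes each read is simply listed with the transcripts it is compatible with, and it ignores transcript length, sequencing error and everything else a real quantification tool models. The name `em_abundance` and the read data are hypothetical.

```python
def em_abundance(read_compat, n_transcripts, n_iter=100):
    """Estimate relative transcript abundances from ambiguous reads.

    read_compat: list of lists; each inner list holds the transcript
    indices a read is compatible with (transcript lengths ignored).
    """
    theta = [1.0 / n_transcripts] * n_transcripts   # uniform starting guess
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read across its compatible transcripts
        # in proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[t] for t in compat)
            for t in compat:
                counts[t] += theta[t] / total
        # M-step: new abundances are the normalized expected counts.
        n_reads = sum(counts)
        theta = [c / n_reads for c in counts]
    return theta


# 6 reads unique to transcript 0, 2 unique to transcript 1,
# and 4 reads that map equally well to both (made-up data).
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
theta = em_abundance(reads, n_transcripts=2)
print(theta)  # converges towards [0.75, 0.25]
```

The ambiguous reads end up shared 3:1, following the ratio of the uniquely mapped reads, rather than being split evenly or discarded.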

Heck, even deep learning methods like convolutional neural networks (CNNs) are used all the time to study the genome! Take a look here: https://www.ncbi.nlm.nih.gov/pubmed/27197224 -- training CNNs on DNA sequences to predict DNA accessibility!
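If you are wondering how DNA gets into a CNN at all: the sequence is typically one-hot encoded, and each first-layer filter then slides along it much like a motif scanner. Here is a small NumPy sketch of just that step, assuming a hand-written "TATA" filter rather than a learned one. The helper names `one_hot` and `conv1d_scan` are made up for illustration and are not taken from the linked paper.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases stay all-zero."""
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        if base in BASES:
            out[i, BASES.index(base)] = 1.0
    return out


def conv1d_scan(x, kernel):
    """Slide a (width, 4) filter along the sequence (valid cross-correlation)."""
    width = kernel.shape[0]
    return np.array([np.sum(x[i:i + width] * kernel)
                     for i in range(x.shape[0] - width + 1)])


# A hand-made filter that responds to the motif "TATA"; in a real CNN the
# filter weights would be learned from labelled sequences instead.
motif_filter = one_hot("TATA")
seq = "GGCTATAAGC"
activations = np.maximum(conv1d_scan(one_hot(seq), motif_filter), 0.0)  # ReLU
best = int(np.argmax(activations))
print("strongest activation at position", best)
```

A trained network stacks many such filters, pools their activations and feeds them to further layers, but the input representation is essentially this.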

These are all genomics things but I apply genomics to the study of cancer all the time! I'll just pick out a couple of Nature papers from the bibliography for a cancer genomics manuscript that I'm currently writing up:

https://www.ncbi.nlm.nih.gov/pubmed/25043018 - Here they use support vector machines (SVMs) to classify neuroblastomas by the oncogene MYCN's amplification status.
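For a feel of what an SVM classifier does, here is a from-scratch sketch of a linear SVM trained by subgradient descent on the regularized hinge loss, using NumPy. This is not the pipeline from the paper: the data are made up, the two features are hypothetical, and the hyperparameters (`lam`, `lr`, `n_epochs`) are arbitrary choices for a toy problem. In real work you would use an established SVM library.

```python
import numpy as np


def train_linear_svm(X, y, lam=0.01, lr=0.1, n_epochs=200):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss.

    y must contain labels -1.0 or +1.0.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # samples violating the margin
        # Subgradient of lam/2 * ||w||^2 + mean hinge loss.
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up data: 6 tumours x 2 hypothetical expression features,
# labelled +1 (amplified) or -1 (not amplified).
raw = np.array([
    [3.0, 1.0], [2.5, 1.2], [3.2, 0.8],
    [0.5, 1.1], [0.8, 0.9], [0.3, 1.0],
])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
X = raw - raw.mean(axis=0)                      # center the features
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
print("weights:", np.round(w, 2), "predictions:", pred)
```

On this toy set only the first feature separates the classes, so nearly all of the weight ends up on it.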

https://www.ncbi.nlm.nih.gov/pubmed/16273092 - Here's another paper that uses supervised classification to classify microarray gene expression profiles of cells as oncogene-activated versus not (and they also use it for feature selection to figure out a list of genes that allow the accurate classification -- this is an example of how we derive a cancer "gene signature").
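Deriving a "gene signature" by feature selection can be sketched very simply: score every gene by how well it separates the two classes and keep the top few. The NumPy example below uses a signal-to-noise score as a stand-in; that choice, the gene names and the values are all assumptions for illustration, not the method used in the paper.

```python
import numpy as np


def rank_genes(X, y, gene_names):
    """Rank genes by a signal-to-noise score between two classes (labels 0/1)."""
    a, b = X[y == 0], X[y == 1]
    # |difference of class means| divided by the sum of class std devs.
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (
        a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1) + 1e-9)
    order = np.argsort(score)[::-1]             # best-separating genes first
    return [(gene_names[i], float(score[i])) for i in order]


# Made-up expression matrix: 6 samples x 3 hypothetical genes.
genes = ["GENE_A", "GENE_B", "GENE_C"]
X = np.array([
    [5.0, 2.0, 7.1],
    [5.2, 2.1, 6.9],
    [4.9, 1.9, 7.0],
    [5.1, 6.0, 7.0],
    [5.0, 6.2, 7.2],
    [4.8, 5.9, 6.8],
])
y = np.array([0, 0, 0, 1, 1, 1])                # 0 = control, 1 = oncogene-activated
ranking = rank_genes(X, y, genes)
signature = [name for name, _ in ranking[:1]]   # keep the top-k genes (k=1 here)
print(ranking)
```

To avoid an over-optimistic signature, the ranking should be computed inside each cross-validation fold rather than once on all samples.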

The question should be: What are some areas where machine learning is NOT used? :)

4.3 years ago

Mainly it has applicability in:

  • classification of tumours into sub-types
  • prediction of response to therapy

Both of these are ultimately related, one could argue.

Any ML algorithm / model working in this space should be able to utilise data from multiple sources:

  • copy number and structural variants (large and small)
  • somatic variants
  • germline variants
  • expression data
  • proteomics data
  • clinical and demographic data
  • imaging data
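One simple way to combine sources like these is "early fusion": standardize each data type separately, then concatenate the features per sample before training a model. Here is a minimal NumPy sketch under that assumption. The sample IDs, feature values and function names are made up, and real integration usually also needs handling for missing data and very different feature counts per source.

```python
import numpy as np


def zscore(block):
    """Standardize each feature column; constant columns become zeros."""
    sd = block.std(axis=0)
    sd[sd == 0] = 1.0
    return (block - block.mean(axis=0)) / sd


def fuse(modalities):
    """Concatenate per-sample features across modalities (early fusion).

    modalities: dict name -> dict sample_id -> list of feature values.
    Only samples present in every modality are kept.
    """
    shared = sorted(set.intersection(*(set(m) for m in modalities.values())))
    blocks = [
        zscore(np.array([modalities[name][s] for s in shared], dtype=float))
        for name in sorted(modalities)          # fixed column order
    ]
    return shared, np.hstack(blocks)


# Hypothetical patients with partially overlapping data sources.
modalities = {
    "expression": {"P1": [8.1, 2.0], "P2": [7.9, 2.4],
                   "P3": [3.0, 6.1], "P4": [2.8, 5.9]},
    "copy_number": {"P1": [2.0], "P2": [2.0], "P3": [4.0]},
    "clinical": {"P1": [61, 0], "P2": [54, 1], "P3": [70, 1], "P5": [48, 0]},
}
samples, features = fuse(modalities)
print(samples, features.shape)
```

Dropping every patient who lacks one data type is wasteful on small cohorts, which is one reason multi-source models are harder in practice than the list suggests.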

However, there are issues relating to the biopsy that is taken and analysed. For example, is the biopsy representative of the entire tumour bulk? What if we take two biopsies from the same individual and end up with two completely different results?

This is another reason why there is never a 'one size fits all' solution for cancer. Low-cost monitoring methods are therefore important, but rarely receive much focus.

For papers, you can search via your search engine of choice.

Kevin

4.3 years ago
jgreener ▴ 390

To add to the other answers, machine learning and in particular deep neural networks have revolutionised the field of protein structure prediction. See some (slightly over-hyped) press coverage at https://www.theguardian.com/science/2018/dec/02/google-deepminds-ai-program-alphafold-predicts-3d-shapes-of-proteins.

To draw some examples from our own research, we have a review paper at https://onlinelibrary.wiley.com/doi/full/10.1002/prot.25824 discussing some of the recent advances.

We also have research at https://www.nature.com/articles/s41467-019-11994-0 showing that these recent advances increase the coverage of structural modelling of various genomes. This may have medical implications in the future.
