Online Live TScourse: Python Machine Learning in Biology (July 20th-24th, 2020)
The biological sciences are becoming increasingly information-intensive and data-rich. For example, the growing availability of DNA sequence data and clinical measurements from humans promises a better understanding of important questions in biology. However, the complexity and high dimensionality of these data make it difficult to extract mechanisms from them. Machine learning techniques promise to be useful tools for addressing such questions because they provide a mathematical framework for analyzing large, complex biological datasets. In turn, the unique computational and mathematical challenges posed by biological data may ultimately advance the field of machine learning as well.
This course will cover the basics of the Python programming language as well as the pandas and scikit-learn (sklearn) Python libraries for data wrangling and machine learning.
By the end of this course, participants will understand:
- How to input and clean data in Python using the pandas library
- How to perform exploratory data analysis in Python
- How to use the sklearn library in Python for machine learning workflows
- How to choose an appropriate machine learning model for the task
- How to use supervised machine learning models (SVM, Decision Trees, Neural Networks, etc.) for classification tasks
- How to use unsupervised machine learning models for clustering tasks
- How to evaluate machine learning models and interpret their results
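
As a rough illustration of how the steps listed above fit together, here is a minimal sketch. It is not taken from the course material: the breast cancer dataset bundled with scikit-learn, the SVM and k-means settings, and all variable names are assumptions chosen only to keep the example self-contained.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Input: load a small clinical dataset shipped with scikit-learn as a
# pandas DataFrame (an assumed stand-in for a real biological dataset).
df = load_breast_cancer(as_frame=True).frame

# Clean and explore: drop incomplete rows, then inspect the class balance.
df = df.dropna()
class_counts = df["target"].value_counts()

# Separate features from labels and hold out a stratified test set.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Supervised classification: standardize the features, then fit an SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

# Evaluation: accuracy on the held-out test set.
accuracy = accuracy_score(y_test, clf.predict(X_test))

# Unsupervised clustering: k-means on standardized features, labels ignored.
X_scaled = StandardScaler().fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

print(class_counts)
print(f"SVM test accuracy: {accuracy:.3f}")
print(pd.Series(clusters).value_counts())
```

Wrapping the scaler and the classifier in one pipeline means the scaling parameters are learned from the training split only, which avoids leaking information from the test set into the model.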
This course is intended to give participants a conceptual overview of machine learning algorithms and an intuition for the mathematics underlying them, equipping them to choose and implement appropriate models for biological datasets.
Requirements: Graduate or postgraduate degree in Life Sciences and a basic knowledge of Statistics. Prior Python knowledge is useful but not required: the course will cover the basic Python skills necessary to input, clean, and explore data as well as build and evaluate machine learning models.
All participants must have a personal laptop (Windows, Macintosh, or Linux) and a good internet connection.