The nature of data in biomedicine is changing rapidly. Low-throughput, easy-to-manage measurements such as temperature, heart rate, or blood pressure are being joined by new data types like genomics, transcriptomics, and metagenomics, which are very different: they require new approaches, often leveraging complex statistical methods, to detect patterns.
In this hands-on workshop, we will discuss how to process such datasets and prepare them for machine learning exploration and classification, as well as how to render meaningful visualizations that help explain and use the data. To avoid technical complications, we will rely on the T-BioInfo platform, which allows participants to assemble complex pipelines and focus on the logic of analysis while having access to simple R scripts that can be modified and improved for effective visualization.
THE WORKSHOP CAN BE ACCESSED ONLINE! – reach out for more details.
Part 1, June 27: Review of high-throughput data formats and processing techniques, with consideration of the biology represented in such datasets. Topics we will cover include standard approaches to creating a gene expression table and quantifying reads; statistical analysis of differential gene expression; annotation of gene IDs for downstream analysis; and examples of biomedical research projects that apply these approaches to precision medicine.
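The statistical step above can be sketched in just a few lines. The following is a minimal illustration on simulated log-expression values, not the workshop's actual pipeline: the group sizes, effect sizes, and the simple per-gene t-test with Benjamini–Hochberg correction are assumptions for demonstration (production analyses typically use dedicated tools such as DESeq2, edgeR, or limma in R).

```python
# Toy differential expression analysis on a simulated gene expression
# table (genes x samples, two groups). All numbers are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes, n_per_group = 100, 5
control = rng.normal(10, 1, size=(n_genes, n_per_group))
treated = rng.normal(10, 1, size=(n_genes, n_per_group))
treated[:10] += 3  # simulate 10 differentially expressed genes

# Per-gene two-sample t-test across the sample axis
t, p = stats.ttest_ind(treated, control, axis=1)

# Benjamini-Hochberg FDR correction (adjusted p-values)
order = np.argsort(p)
ranked = p[order] * n_genes / (np.arange(n_genes) + 1)
adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
qvals = np.empty(n_genes)
qvals[order] = np.clip(adjusted, 0, 1)

significant = np.flatnonzero(qvals < 0.05)
print(f"{significant.size} genes pass FDR < 0.05")
```

The resulting list of significant gene indices is what would then be mapped to gene IDs and annotated for downstream analysis.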
Part 2, June 28: Machine learning techniques for exploring and classifying high-throughput data: PCA and clustering, as well as feature selection for conventional machine learning methods like Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). We will also cover data-driven workflows in biomedical projects that use high-throughput data; visualization techniques using R libraries such as ggplot2; reproducible workflows using R Markdown; and data exploration using visualization tools.
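The exploration-then-classification flow in Part 2 can be sketched as follows. The workshop itself uses R; this is a hedged Python/scikit-learn illustration on synthetic data, where the sample counts, the number of informative genes, and the PCA-then-SVM pipeline are assumptions chosen purely to show the idea.

```python
# Toy version of the Part 2 workflow: PCA for unsupervised exploration,
# then a cross-validated SVM classifier on the top components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n_samples, n_genes = 40, 100
X = rng.normal(size=(n_samples, n_genes))   # samples x genes
y = np.repeat([0, 1], n_samples // 2)       # two phenotype groups
X[y == 1, :20] += 2.0                       # 20 informative genes

# Unsupervised exploration: project samples onto the first two PCs
pca = PCA(n_components=2)
scores = pca.fit_transform(StandardScaler().fit_transform(X))
print("variance explained:", pca.explained_variance_ratio_)

# Supervised classification: scale -> PCA -> linear SVM, cross-validated
clf = make_pipeline(StandardScaler(), PCA(n_components=10),
                    SVC(kernel="linear"))
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"5-fold CV accuracy: {acc:.2f}")
```

In practice the `scores` matrix is what gets plotted (e.g., with ggplot2 in R), coloring samples by phenotype to see whether groups separate before any classifier is trained.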
This schedule is subject to change; please keep in touch for updates or visit: https://edu.t-bio.info/nyu-june-workshop/