Statistics and R for the Life Sciences from Harvard University

An introduction to basic statistical concepts and R programming skills necessary for analyzing data in the life sciences.

We will learn the basics of statistical inference in order to understand and compute p-values and confidence intervals. We will provide examples by programming in R in a way that helps make the connection between concepts and implementation. Problem sets requiring R programming will be used to test understanding and the ability to implement basic data analyses. We will use visualization techniques to explore new data sets and determine the most appropriate approach. We will describe robust statistical techniques as alternatives when data do not fit the assumptions required by the standard approaches. We will also introduce the basics of using R scripts to conduct reproducible research.

Topics:

- Distributions
- Exploratory Data Analysis
- Inference
- Non-parametric statistics
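
As a rough illustration of how these topics fit together in R, here is a minimal sketch that is not taken from the course materials. The data are simulated, and the names `control` and `treatment` are hypothetical. It uses only base R functions (`rnorm`, `boxplot`, `t.test`, `wilcox.test`).

```r
# Simulated data for illustration only; not from the course materials.
set.seed(1)
control   <- rnorm(12, mean = 24, sd = 3)  # hypothetical control-group measurements
treatment <- rnorm(12, mean = 27, sd = 3)  # hypothetical treated-group measurements

# Exploratory data analysis: compare the two groups visually
boxplot(list(control = control, treatment = treatment), ylab = "measurement")

# Inference: Welch two-sample t-test gives a p-value and a 95% confidence
# interval for the difference in means (treatment minus control)
tt <- t.test(treatment, control)
tt$p.value
tt$conf.int

# Non-parametric alternative: Wilcoxon rank-sum test, which does not
# assume the data are normally distributed
wt <- wilcox.test(treatment, control)
wt$p.value
```

Setting the seed and keeping every step in a script means the same numbers are produced on every run, which is the basic idea behind reproducible research with R scripts.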

This 5-week course is the first in an eight-part series on Data Analysis for Genomics (version 2):

PH525.1x: Statistics and R for the Life Sciences

PH525.2x: Introduction to Linear Models and Matrix Algebra

PH525.3x: Advanced Statistics for the Life Sciences

PH525.4x: Introduction to Bioconductor

PH525.5x: Case study: RNA-seq data analysis

PH525.6x: Case study: Variant Discovery and Genotyping

PH525.7x: Case study: ChIP-seq data analysis

PH525.8x: Case study: DNA methylation data analysis

Version 2 of Data Analysis for Genomics builds on the book from version 1 and on a lot of feedback, including this one.