Question

RNA-seq: how to handle biological replicates for differential expression analysis

3.6 years ago

GenomicsNewbie • 0

I know that you can use tools like DESeq2 or edgeR for RNA-seq differential expression (DE) analysis. My question is how to handle multiple biological replicates. In my dataset, I have 3 biological replicates for healthy samples and 3 biological replicates for diseased samples. I understand that biological replicates should not be merged, so how do we use these 6 samples in a DE analysis? I have heard about the false discovery rate, but does that mean we test every possible healthy-diseased pair?

RNA-Seq • 1.1k views

updated 3.6 years ago by rpolicastro • written 3.6 years ago by GenomicsNewbie

score 3 · Answer 1 · 2020-09-04

Let's say that you have 2 conditions with 3 biological replicates each: WT-1, WT-2, WT-3, KO-1, KO-2, KO-3. Samples of the same type are labelled with the same factor level, so the factor levels of your samples become WT, WT, WT, KO, KO, KO.

Both edgeR and DESeq2 need a description of the experimental design. In DESeq2, for example, the sample table (the colData argument) is a data.frame with sample names as row names and one column per experimental factor. For our example, the data.frame would look like this:

> df
     condition
WT-1        WT
WT-2        WT
WT-3        WT
KO-1        KO
KO-2        KO
KO-3        KO
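
A minimal sketch of how such a data.frame could be built in base R. Putting WT first in the factor levels is an assumption about which group you want as the baseline; the variable names are only for illustration.

```r
# Sample names and their condition labels, in matching order.
samples <- c("WT-1", "WT-2", "WT-3", "KO-1", "KO-2", "KO-3")
condition <- c("WT", "WT", "WT", "KO", "KO", "KO")

# Listing WT first makes it the reference level (an assumption about
# which group should be the baseline).
df <- data.frame(
  condition = factor(condition, levels = c("WT", "KO")),
  row.names = samples
)

df
table(df$condition)  # three replicates per condition
```

The replicates stay as six separate rows; only the shared factor level tells the software which samples belong together.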

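For this example dataset, the design formula for differential expression is then ~ condition. The model is fit once per gene using all six samples together, so you do not compare every healthy-diseased pair; the false discovery rate comes from adjusting the per-gene p-values for the number of genes tested.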
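
To show how the sample table and formula fit together, here is a rough sketch of a DESeq2 run. It assumes DESeq2 is installed, and the count matrix is simulated purely as a stand-in for your real raw counts (genes in rows, samples in columns); the gene names and simulation parameters are made up.

```r
# Sketch only: simulated counts stand in for a real raw count matrix.
library(DESeq2)

set.seed(1)
samples <- c("WT-1", "WT-2", "WT-3", "KO-1", "KO-2", "KO-3")
df <- data.frame(
  condition = factor(c("WT", "WT", "WT", "KO", "KO", "KO"),
                     levels = c("WT", "KO")),
  row.names = samples
)

# Hypothetical data: 1000 genes x 6 samples, each gene with its own mean.
n_genes <- 1000
gene_means <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * 6, mu = rep(gene_means, times = 6), size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), samples)
)

# Columns of the count matrix must line up with rows of the sample table.
stopifnot(identical(colnames(counts), rownames(df)))

dds <- DESeqDataSetFromMatrix(countData = counts, colData = df,
                              design = ~ condition)
dds <- DESeq(dds)  # one model fit per gene, using all six samples at once

# KO versus WT; the padj column holds Benjamini-Hochberg adjusted p-values.
res <- results(dds, contrast = c("condition", "KO", "WT"))
head(res[order(res$padj), ])
```

Because the simulated counts contain no real difference between WT and KO, few or no genes should pass a typical padj cutoff here; with real data you would filter on padj rather than on the raw p-value.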