Question

Setting up scRNA-seq differential gene expression analysis

0

3.5 years ago

kalanir • 0

Hello! I want to find the differentially expressed genes between two groups (Healthy vs Infected) in single-cell data while controlling for days post symptoms. How would I go about doing that? In bulk RNA-seq I would usually set the design as ~group, group:time; however, with MAST as used in this vignette (https://satijalab.org/seurat/v3.0/de_vignette.html), I don't see that option.

Would it be okay for me to set Healthy as day 7 and run this analysis - regressing time out?

Example data structure:
Pt1: Healthy
Pt2: Healthy
Pt3: Infected, 3 days post symptoms
Pt4: Infected, 7 days post symptoms
Pt5: Infected, 0 days post symptoms

scRNAseq differential gene expression DEGs RNA-Seq • 1.2k views

ADD COMMENT • link updated 3.4 years ago by Biostar 20 • written 3.5 years ago by kalanir • 0

score 2 · Answer 1 · 2020-11-08

2

3.5 years ago

ATpoint 82k

I would simply take the count matrix out of the Seurat object and then run whatever testing machinery you like. I personally prefer to aggregate the data into pseudobulks and then run edgeR as in any "normal" (= bulk) analysis. Do you have replicates, i.e. multiple samples per group? I am a Bioconductor rather than a Seurat user, but it looks as if the linked vignette focuses on marker gene detection, i.e. simple pairwise comparisons between clusters of the same sample. MAST has a Bioconductor package (https://www.bioconductor.org/packages/release/bioc/vignettes/MAST/inst/doc/MAST-Intro.html), so you could convert your Seurat object to a SingleCellExperiment (there are converters in Seurat for this, afaik) and then follow the manual.
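To make the pseudobulk step concrete, here is a minimal sketch of the aggregation only, written in plain Python rather than R. It is not the edgeR workflow itself: the function name `pseudobulk`, the toy gene names, and the made-up counts are all assumptions for illustration. The idea is just to sum the raw counts of all cells from the same sample, so that each sample becomes one column, as in a bulk experiment.

```python
from collections import defaultdict


def pseudobulk(counts, cell_sample):
    """Sum raw counts over all cells that belong to the same sample.

    counts:      dict mapping gene -> list of raw counts, one entry per cell
    cell_sample: list giving the sample label of each cell (same cell order)
    Returns a dict mapping gene -> {sample: summed count}.
    """
    result = {}
    for gene, per_cell in counts.items():
        if len(per_cell) != len(cell_sample):
            raise ValueError(
                f"{gene}: expected {len(cell_sample)} cells, got {len(per_cell)}"
            )
        sums = defaultdict(int)
        for sample, value in zip(cell_sample, per_cell):
            sums[sample] += value
        result[gene] = dict(sums)
    return result


# Toy data (made up): 6 cells from 3 patients, 2 hypothetical genes
cell_sample = ["Pt1", "Pt1", "Pt3", "Pt3", "Pt4", "Pt4"]
counts = {
    "GeneA": [0, 2, 5, 7, 1, 0],
    "GeneB": [3, 3, 0, 0, 4, 1],
}
bulk = pseudobulk(counts, cell_sample)
```

In practice one would typically aggregate per sample within each cluster or cell type, and then hand the resulting gene-by-sample matrix to the bulk tool of choice together with the design of interest.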

ADD COMMENT • link 3.5 years ago by ATpoint 82k

0

Following up on this, I would recommend the following chapter of the OSCA book: Marker gene detection with blocking factors.

For turning a Seurat object into an SCE object, see the Satija Lab's description.

ADD REPLY • link 3.4 years ago by Friederike 8.9k

0

What I found crucial is to filter for a minimum fraction of cells per cluster that express the gene (i.e. have counts > 0) in at least one of the two groups being compared. That saves you from spurious DE results driven by a few outlier cells, at both the pseudobulk and the single-cell level. I personally use 0.1 (10%) by default; for markers that one might want to use for FACS analysis, a higher fraction such as 0.2 might make sense.
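As a rough sketch of that filter, again in plain Python with made-up data: the helper `passes_min_fraction` and the gene names are hypothetical, and the sketch assumes the cells passed in all come from one cluster. A gene is kept if the fraction of cells with counts > 0 reaches the threshold in at least one of the groups.

```python
def passes_min_fraction(per_cell_counts, cell_group, min_fraction=0.1):
    """Return True if the gene has counts > 0 in at least `min_fraction`
    of the cells of at least one group."""
    totals = {}      # group -> number of cells
    expressing = {}  # group -> number of cells with counts > 0
    for group, value in zip(cell_group, per_cell_counts):
        totals[group] = totals.get(group, 0) + 1
        if value > 0:
            expressing[group] = expressing.get(group, 0) + 1
    return any(
        expressing.get(group, 0) / n_cells >= min_fraction
        for group, n_cells in totals.items()
    )


# Toy data (made up): one cluster with 20 Healthy and 20 Infected cells
cell_group = ["Healthy"] * 20 + ["Infected"] * 20
counts = {
    # a single outlier cell with a high count: 1/20 = 5% of Infected cells
    "Outlier": [0] * 20 + [50] + [0] * 19,
    # expressed in 4/20 = 20% of Healthy cells
    "Broad": [1, 2, 1, 3] + [0] * 16 + [0] * 20,
}
kept = [gene for gene in counts if passes_min_fraction(counts[gene], cell_group)]
```

Here "Outlier" is dropped even though its total count is high, because only one cell expresses it, while "Broad" is kept.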

ADD REPLY • link 3.4 years ago by ATpoint 82k