Setting up scRNA-seq differential gene expression analysis
3.5 years ago
kalanir • 0

Hello! I want to find differentially expressed genes between two groups (Healthy vs Infected) in single-cell data, while controlling for days post symptoms. How would I go about doing that? In bulk RNA-seq I would usually set the design to ~ group + group:time, but with MAST as used in this vignette (https://satijalab.org/seurat/v3.0/de_vignette.html) I don't see that option.

Would it be okay for me to set Healthy as day 7 and run this analysis - regressing time out?

Example data structure:
Pt1: Healthy
Pt2: Healthy
Pt3: Infected, 3 days post symptoms
Pt4: Infected, 7 days post symptoms
Pt5: Infected, 0 days post symptoms

scRNAseq differential gene expression DEGs RNA-Seq • 1.2k views
3.5 years ago
ATpoint 82k

I would simply take the count matrix out of the Seurat object and then run whatever testing machinery you like. I personally prefer to aggregate the data into pseudobulks and then run edgeR as in any "normal" (= bulk) analysis. Do you have replicates, i.e. multiple samples per group? I am not a Seurat user (I use Bioconductor), but it looks as if the linked vignette focuses on marker gene detection, i.e. simple pairwise comparisons between clusters of the same sample. MAST has a Bioconductor package (https://www.bioconductor.org/packages/release/bioc/vignettes/MAST/inst/doc/MAST-Intro.html), so you could convert your Seurat object to a SingleCellExperiment (there are converters in Seurat for this afaik) and then follow the manual.
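
To make the pseudobulk idea concrete, here is a minimal sketch in plain Python of the aggregation step only: raw counts are summed over all cells that belong to the same sample, giving one count vector per patient that a bulk tool such as edgeR could then take as input. The function name `pseudobulk`, the dictionary layout, and the toy counts are assumptions made for illustration; they are not part of Seurat, edgeR, or MAST, and in practice you would usually aggregate per cluster and per sample.

```python
from collections import defaultdict


def pseudobulk(counts, cell_to_sample):
    """Sum per-cell raw counts into one count vector per sample.

    counts: dict mapping cell barcode -> list of raw counts (one per gene)
    cell_to_sample: dict mapping cell barcode -> sample (patient) ID
    Both structures are hypothetical stand-ins for a real count matrix
    and its cell metadata.
    """
    n_genes = len(next(iter(counts.values())))
    totals = defaultdict(lambda: [0] * n_genes)
    for cell, vec in counts.items():
        acc = totals[cell_to_sample[cell]]
        for gene_idx, count in enumerate(vec):
            acc[gene_idx] += count
    return dict(totals)


# Toy data (made up): 5 cells, 3 genes, 2 patients.
counts = {
    "cell1": [0, 3, 1],
    "cell2": [2, 0, 0],
    "cell3": [1, 1, 4],
    "cell4": [0, 5, 0],
    "cell5": [3, 0, 2],
}
cell_to_sample = {
    "cell1": "Pt1",
    "cell2": "Pt1",
    "cell3": "Pt3",
    "cell4": "Pt3",
    "cell5": "Pt3",
}

pb = pseudobulk(counts, cell_to_sample)
print(pb)
```

Each resulting vector is one "sample" in the bulk sense, which is why the question about replicates matters: with only one pseudobulk per group there is nothing to estimate variability from.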


Following up on this, I would recommend the following chapter of the OSCA book: Marker gene detection with blocking factors.

For turning a Seurat object into an SCE object, see the Satija Lab's description.


What I found crucial is to filter for a minimum fraction of cells per cluster that express the gene (i.e. have counts > 0) in at least one of the two groups being compared. That saves you from spurious DE results driven by a few outlier cells, at both the pseudobulk and the single-cell level. I personally use 0.1 (10%) by default; for markers that one might want to use for FACS analysis, a higher fraction such as 0.2 might make sense.
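
A minimal sketch of that filter in plain Python, assuming you already have, for one cluster, the per-cell counts of each gene split by group. The helper names `fraction_expressing` and `keep_gene`, the gene names, and the toy counts are hypothetical and only illustrate the rule "keep the gene if at least one group reaches the minimum fraction of expressing cells".

```python
def fraction_expressing(cell_counts):
    """Fraction of cells with a count > 0 for a single gene."""
    if not cell_counts:
        return 0.0
    return sum(1 for c in cell_counts if c > 0) / len(cell_counts)


def keep_gene(group_a, group_b, min_frac=0.1):
    """Keep a gene if at least one of the two groups reaches min_frac."""
    best = max(fraction_expressing(group_a), fraction_expressing(group_b))
    return best >= min_frac


# Toy data (made up): counts per cell within one cluster, 20 cells per group.
healthy = {
    "geneA": [0] * 19 + [50],   # a single outlier cell: 5% expressing
    "geneB": [0] * 20,          # not expressed in Healthy
}
infected = {
    "geneA": [0] * 20,          # not expressed in Infected
    "geneB": [1, 2, 0, 3] * 5,  # 15 of 20 cells: 75% expressing
}

kept = [g for g in healthy if keep_gene(healthy[g], infected[g], min_frac=0.1)]
print(kept)
```

Here geneA is dropped because its apparent signal comes from one outlier cell, while geneB is kept even though it is absent in one group, since the rule only requires one group to pass the threshold.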
