Question

Statistical test for a difference between two conditions while accounting for batch

3 days ago

npont ▴ 20

Hi all,

I am working on a scRNA-seq dataset. I computed a module score (Seurat's AddModuleScore) and want to test whether this score differs significantly between two conditions.

I have three batches, and each contains both conditions (KO_batch1, WT_batch1, KO_batch2, WT_batch2, KO_batch3, WT_batch3).

I would normally use a Wilcoxon test or a t-test between condition 1 and condition 2, but that would not account for the batches. It would treat all cells as independent observations when they are not, so I would get misleadingly low p-values. Alternatively, I could average the scores per batch and condition and then test whether the per-batch differences are significant. But that would give very low power, since I would effectively be testing only three differences (one per batch: module score in condition 2 vs. condition 1).
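To make the per-batch averaging idea concrete, here is a minimal sketch of a paired test on the three KO minus WT differences, in plain Python. Everything in it is an assumption for illustration: the `cells` records, the scores, and the function names are made up, and it stops at the t statistic and degrees of freedom rather than computing a p-value.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical per-cell records: (batch, condition, module score).
# Real data would come from the Seurat object's metadata.
cells = [
    ("batch1", "WT", 0.10), ("batch1", "WT", 0.14),
    ("batch1", "KO", 0.30), ("batch1", "KO", 0.34),
    ("batch2", "WT", 0.20), ("batch2", "WT", 0.24),
    ("batch2", "KO", 0.45), ("batch2", "KO", 0.49),
    ("batch3", "WT", 0.05), ("batch3", "WT", 0.09),
    ("batch3", "KO", 0.20), ("batch3", "KO", 0.24),
]


def per_batch_differences(cells, case="KO", control="WT"):
    """Average the score per (batch, condition), then take case minus control per batch."""
    scores = defaultdict(list)
    for batch, condition, score in cells:
        scores[(batch, condition)].append(score)
    batches = sorted({batch for batch, _ in scores})
    return {b: mean(scores[(b, case)]) - mean(scores[(b, control)]) for b in batches}


def paired_t_statistic(differences):
    """One-sample t statistic on the per-batch differences (equivalent to a paired t-test)."""
    values = list(differences.values())
    n = len(values)
    t = mean(values) / (stdev(values) / math.sqrt(n))
    return t, n - 1  # t statistic and degrees of freedom


differences = per_batch_differences(cells)
t_stat, df = paired_t_statistic(differences)
print(differences, t_stat, df)
```

With three batches the test has only 2 degrees of freedom, no matter how many cells were sequenced, which is exactly the low power described above.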

So what is the best approach here? I've heard about linear mixed-effects models, but I am unsure what they are, how to use them, and whether they are suited to this problem.

A big thank you for your help:)

significance-testing test seurat scrna-seq batch • 2.9k views

ADD COMMENT • link updated 17 hours ago by ATpoint 89k • written 3 days ago by npont ▴ 20

Are the batches biological replicates?

ADD REPLY • link 3 days ago by ATpoint 89k

Yes they are

ADD REPLY • link 19 hours ago by npont ▴ 20

Great, then I would do a pseudobulk analysis. Test contrasts with a DE tool of choice, for example edgeR, and then use gene set enrichment analysis, for example camera from limma, to test whether these gene sets (aka modules) differ between conditions. This is a lot more statistically robust than working with the module scores directly.
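For anyone unsure what the pseudobulk step involves: it amounts to summing the raw counts of all cells belonging to the same sample, here each batch and condition combination, so that six samples remain instead of thousands of cells. A minimal sketch in plain Python follows; the `cells` data, gene names, and the `pseudobulk` function are hypothetical and only illustrate the aggregation, not the DE or gene set testing itself.

```python
from collections import defaultdict

# Hypothetical per-cell raw counts: (sample label, {gene: count}).
# In practice these would be the raw count matrix plus per-cell sample labels.
cells = [
    ("WT_batch1", {"GeneA": 3, "GeneB": 0}),
    ("WT_batch1", {"GeneA": 1, "GeneB": 2}),
    ("KO_batch1", {"GeneA": 7, "GeneB": 1}),
    ("KO_batch1", {"GeneA": 5, "GeneB": 0}),
    ("WT_batch2", {"GeneA": 2, "GeneB": 4}),
    ("KO_batch2", {"GeneA": 9, "GeneB": 3}),
]


def pseudobulk(cells):
    """Sum raw counts per gene across all cells of the same sample."""
    totals = defaultdict(lambda: defaultdict(int))
    for sample, counts in cells:
        for gene, count in counts.items():
            totals[sample][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(genes) for sample, genes in totals.items()}


bulk = pseudobulk(cells)
for sample in sorted(bulk):
    print(sample, bulk[sample])
```

The resulting gene-by-sample table is what would then go into edgeR. One common choice, stated here as an assumption rather than the only option, is a design with batch as a blocking factor plus condition, so that KO and WT are compared within each batch.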

ADD REPLY • link 17 hours ago by ATpoint 89k