Question: Differential expression testing in Seurat
suvratha (40), Ulm, wrote 8 weeks ago:

If I've understood Seurat properly, in the initial steps it scales and centers the data with the ScaleData() function so that the data resembles a normal distribution. If this is the case, then why is the default test for differential expression the Wilcoxon test, which is non-parametric? Would it not be better to use DESeq2 instead and trust the results from DESeq2?

Please do correct me if I've misunderstood anything here.

Thank you. Suvi

seurat rna-seq R • 240 views
modified 8 weeks ago by ATpoint (35k) • written 8 weeks ago by suvratha (40)

If you treat the hundreds of cells you have per sample as replicates, there shouldn't be much need for the sophisticated modelling that DESeq2 does to overcome the typical limitation of bulk RNA-seq data, namely the lack of replicates. A t-test (or, alternatively, a Wilcoxon test) usually works fine if you have hundreds of replicates per gene. That said, DESeq2 would also use the raw read counts, not the scaled data.
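To make the rank-based test mentioned above concrete, here is a minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) test using the normal approximation, which is reasonable with hundreds of cells per group. It is written in plain Python purely as an illustration and is not Seurat's implementation; the function name `rank_sum_test` and the simulated data are assumptions, and no continuity correction is applied.

```python
import math
import random
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for x, p-value). Ties get midranks and the
    variance uses the standard tie correction. No continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))

    # Tied values share the average of the ranks they occupy.
    midrank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(midrank[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2

    # Variance of U under the null, corrected for ties.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical: no evidence of a shift
        return u, 1.0

    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p


# Simulated, purely illustrative values for one gene in two groups of
# 300 cells each; many values are exactly zero, as in single-cell data.
rng = random.Random(0)
group_a = [max(0.0, rng.gauss(0.5, 1.0)) for _ in range(300)]
group_b = [max(0.0, rng.gauss(0.8, 1.0)) for _ in range(300)]
u_stat, p_value = rank_sum_test(group_a, group_b)
print(f"U = {u_stat:.0f}, two-sided p = {p_value:.3g}")
```

Because the test only uses ranks, it makes no assumption that the expression values are normally distributed.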

written 8 weeks ago by Friederike (5.7k)

So which results should I report: the ones from the Wilcoxon test or the ones from DESeq2? Also, DESeq2 gives me far more differentially expressed genes than the Wilcoxon test, so I don't know which ones to trust. For example, the gene I'm looking into is detected only when I use DESeq2 and not with the Wilcoxon test, so I don't know what to do.

modified 8 weeks ago • written 8 weeks ago by suvratha (40)

In silico, there isn't that much you can do at a single-gene level, but if you're interested in just a single gene, I would strongly recommend looking at the expression pattern of your gene of interest in the groups you're comparing. Get an idea of why that gene seems to be borderline DE, given that one method misses it. Looking at both the raw counts and the normalized data may be helpful.

The only way to know whether your gene is biologically important for whatever conditions you're looking at is to set up an appropriate experiment.
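The suggestion to inspect the gene in both groups, on raw counts as well as normalized data, could be sketched roughly as follows. This is an illustration in plain Python rather than Seurat code: the helper names `log_normalize` and `summarize_gene` and the toy numbers are assumptions, and the normalization shown is the common log1p(count / cell total × 10,000) scheme.

```python
import math
from statistics import mean, median


def log_normalize(count, cell_total, scale_factor=10_000):
    """Library-size normalization followed by log1p (hypothetical helper)."""
    return math.log1p(count / cell_total * scale_factor)


def summarize_gene(gene_counts, cell_totals):
    """Per-group summary of one gene: detection rate, raw and normalized level."""
    normalized = [log_normalize(c, t) for c, t in zip(gene_counts, cell_totals)]
    n_cells = len(gene_counts)
    return {
        "n_cells": n_cells,
        "pct_expressing": 100 * sum(c > 0 for c in gene_counts) / n_cells,
        "mean_raw": mean(gene_counts),
        "median_raw": median(gene_counts),
        "mean_lognorm": mean(normalized),
    }


# Toy data: raw counts of the gene of interest and total counts per cell.
groups = {
    "control": ([0, 0, 1, 0, 2, 0], [2000, 2200, 1900, 2100, 2500, 1800]),
    "treated": ([0, 3, 1, 0, 4, 2], [2100, 2600, 2000, 1900, 2700, 2300]),
}
for name, (counts, totals) in groups.items():
    print(name, summarize_gene(counts, totals))
```

A gene that is detected in only a small fraction of cells, or whose difference is driven by a few cells with high counts, is a typical candidate for disagreement between methods.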

written 8 weeks ago by Friederike (5.7k)

I see, thank you for the explanation. My question still remains: which one do I pick for differential expression testing, the Wilcoxon test or DESeq2?

written 8 weeks ago by suvratha (40)

My argument would be that it does not matter. It is more important to understand why the tests disagree for your specific gene IMO.

written 8 weeks ago by Friederike (5.7k)

The only way to know that would be by doing what you suggested in your previous comments?

written 8 weeks ago by suvratha (40)

I'm sure there are more ways, but that's how I would start going about it, yes.

written 8 weeks ago by Friederike (5.7k)
ATpoint (35k), Germany, wrote 8 weeks ago:

The differential testing is performed on the normalized count data, not on the Z-transformed data. FAQ 4 and several GitHub issues briefly discuss this: https://satijalab.org/seurat/faq
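The difference between normalized and Z-transformed values can be illustrated with a small sketch. It is plain Python, not Seurat code; the helper names and toy counts are assumptions. It shows that per-gene centering and scaling gives mean 0 and standard deviation 1 but is only a shift and rescale, so the skewed, zero-heavy shape of the data is unchanged and the values do not become normally distributed.

```python
import math
from statistics import mean, stdev


def log_normalize(count, cell_total, scale_factor=10_000):
    """Library-size normalization followed by log1p (hypothetical helper)."""
    return math.log1p(count / cell_total * scale_factor)


def zscore(values):
    """Center to mean 0 and scale to unit standard deviation (one gene)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def order(values):
    """Indices of the cells sorted by value (stable for ties)."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Toy data for one gene across ten cells: mostly zeros, a few counts.
counts = [0, 0, 0, 0, 1, 0, 3, 0, 7, 0]
totals = [2000, 2500, 1800, 2200, 2100, 1900, 2400, 2000, 2600, 2300]

normalized = [log_normalize(c, t) for c, t in zip(counts, totals)]
scaled = zscore(normalized)

# The ordering of cells is identical before and after scaling, and all
# zero-count cells still share one (now negative) value.
same_order = order(normalized) == order(scaled)
print("normalized:", [round(v, 2) for v in normalized])
print("scaled:    ", [round(v, 2) for v in scaled])
print("same cell ordering:", same_order)
```

Since the cell ordering is preserved, a rank-based test on one gene would see the same ranks either way; the practical point is that the test is run on the normalized values and its non-parametric nature does not conflict with the scaling step.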

written 8 weeks ago by ATpoint (35k)

So then, what's the point of the ScaleData() function?

written 8 weeks ago by suvratha (40)

Mostly for visualization purposes (e.g. heatmaps); the scaled values are also what Seurat's PCA step takes as input.

written 8 weeks ago by Friederike (5.7k)