Question: Single cell DEG analysis. Wildly different results from different FindMarkers test
0
gravatar for jevanveen
5 months ago by
jevanveen20
jevanveen20 wrote:

Hello and thank you in advance for your help.

I am doing a single cell RNA seq experiment and analyzing data with Seurat. I have four treatment groups, which I integrated, then identified clusters of different types of cells in the brain. Everything looked exactly right at this point. Then I wanted to test for DEGs within each cluster. Using the standard Seurat workflow of switching default assay to "RNA" and then using FindMarkers(), I get kind of strange results. Here is a sample volcano plot from one cluster:

Volcano plot 1

If I switch to a statistical method in FindMarkers (like negbinom) that uses RNA counts instead of scaled data, things look much more as I would expect. Here is the same cluster, tested using counts as data.

Volcano plot 2

I have triple checked that I am performing the Seurat workflow as per the tutorial here: https://satijalab.org/seurat/v3.1/immune_alignment.html

It just seems so odd to me that when I do DEG testing the apparently correct way, I get huge LogFC's and also a weird looking volcano plot with many many significant (padj < .05) DEGs with logFC close to zero.

Has anyone encountered a similar issue? Any help would be great. Thank you!

seurat rna-seq • 689 views
ADD COMMENTlink modified 5 months ago • written 5 months ago by jevanveen20
2

If I switch to a statistical method in FindMarkers (like negbinom) that uses RNA counts instead of scaled data, things look much more as I would expect. Here is the same cluster, tested using counts as data.

FindMarkers does not use scaled data. For most test, such as "wilcox", the default slot is "data", which is normalized counts. Using "counts" (raw counts) does not make sense in that case.

ADD REPLYlink written 5 months ago by igor12k

Sorry Igor, you are of course correct that the "data" slot is log normalized, and NOT scaled.

But when I do it the correct way, using the normalized data slot, things look wrong. When I use counts, which is only recommended for tests like negbinom or DESeq2, the results look much more correct, irrespective of the statistical test that I use.

I was mostly wondering if people had encountered similar results - I haven't seen people do many volcano plots on single cell data.

ADD REPLYlink written 5 months ago by jevanveen20

You posted 4 commands and 2 plots. I am not sure which command corresponds to the first plot, but the logFCs are above 20. Something seems wrong there regardless of the test used. The test should only define the p-value.

ADD REPLYlink written 5 months ago by igor12k
1

You really ought to spell out the two different findMarkers commands

ADD REPLYlink written 5 months ago by swbarnes29.4k

Right you are! Commands that will give data producing the first plot:

FindMarkers(all.cells.integrated, ident.1 = "Neuron_LowEarthOrbit", ident.2 = "Neuron_Underwater", test.use = "wilcox")
FindMarkers(all.cells.integrated, ident.1 = "Neuron_LowEarthOrbit", ident.2 = "Neuron_Underwater", test.use = "LR")

Commands that will give the second plot:

FindMarkers(all.cells.integrated, ident.1 = "Neuron_LowEarthOrbit", ident.2 = "Neuron_Underwater", test.use = "negbinom")
FindMarkers(all.cells.integrated, ident.1 = "Neuron_LowEarthOrbit", ident.2 = "Neuron_Underwater", test.use = "wilcox", slot = "counts")
ADD REPLYlink modified 4 months ago by GenoMax95k • written 5 months ago by jevanveen20
1
gravatar for jevanveen
5 months ago by
jevanveen20
jevanveen20 wrote:

I partly figured this out and thought I would update in case someone else comes looking with the same issue.

It seems that I was getting crazy LogFC's and weird looking volcanoes because of a problem with the normalization of my data. For these analyses I had used the Seurat "SCT integration" pipeline as outlined here: https://satijalab.org/seurat/v3.2/integration.html

I probably did something wrong, but after triple checking the workflow, I couldn't find it. When I switched back to an integration workflow that includes log normalization, rather than SCT normalization, everything appears to be fixed.

So in the end, I am unsure of why the SCT integration pipeline failed on me, but switching to log normalization was the solution to my problem.

ADD COMMENTlink written 5 months ago by jevanveen20

SCT Normalization stores the normalized counts in the SCT assay under the counts slot. Because of this, the counts slot is not changed in the RNA assay.

ADD REPLYlink written 4 months ago by rpolicastro3.2k

To avoid any potential confusion, you can specify both slot and assay in FindMarkers.

ADD REPLYlink modified 4 months ago • written 4 months ago by igor12k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1394 users visited in the last hour
_