Seurat: Assay to use for differential expression in an integrated dataset
3.2 years ago
firestar ★ 1.6k

I have a single-cell RNA-seq Seurat object integrated using sctransform.

When running FindMarkers(), which assay should be used: RNA, SCT or integrated? I assume the slot is always "data".

To look at some real data, I ran differential gene expression (DGE) on the same comparison with each of the three assays.

# Same comparison three times; only the assay differs (RNA, SCT, integrated)
f1 <- FindMarkers(obj, ident.1 = "micro-wt-lps-01", ident.2 = "micro-wt-lps-05",
                  group.by = "cell_type_condition_cluster", slot = "data", assay = "RNA")
f2 <- FindMarkers(obj, ident.1 = "micro-wt-lps-01", ident.2 = "micro-wt-lps-05",
                  group.by = "cell_type_condition_cluster", slot = "data", assay = "SCT")
f3 <- FindMarkers(obj, ident.1 = "micro-wt-lps-01", ident.2 = "micro-wt-lps-05",
                  group.by = "cell_type_condition_cluster", slot = "data", assay = "integrated")

The results, including the top DEGs, differ between all three assays.

> head(f1)
              p_val   avg_logFC pct.1 pct.2    p_val_adj
Mmp12  8.656001e-23 -17.9039715 0.001 0.111 1.417853e-18
Cxcl2  2.153860e-18 -14.1067209 0.002 0.111 3.528023e-14
Xylt1  6.461056e-17  -0.6137400 0.010 0.222 1.058321e-12
Rhebl1 5.547881e-15  -0.3580077 0.018 0.278 9.087429e-11
Lpl    1.582957e-14  57.4116369 0.047 0.444 2.592883e-10
Tgfb2  1.202214e-13  -1.4634266 0.007 0.167 1.969227e-09
> head(f2)
               p_val  avg_logFC pct.1 pct.2    p_val_adj
Hp      7.286762e-30 -0.2764460 0.011 0.333 1.193572e-25
S100a11 9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ifitm2  9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ccr2    9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ahnak   9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Cybb    1.039943e-27 -0.4562356 0.018 0.389 1.703426e-23
> head(f3)
               p_val  avg_logFC pct.1 pct.2    p_val_adj
Gas2l3  3.419395e-11  1.7086466 0.126 0.944 1.025819e-07
Cxcr4   2.159168e-09 -0.3924339 0.022 0.500 6.477503e-06
Kif2c   3.049359e-09 -0.5152160 0.119 0.722 9.148077e-06
Rps18   6.295892e-09 -0.8022487 0.034 0.167 1.888768e-05
mt-Atp6 2.352245e-08 -0.5896836 0.167 0.778 7.056735e-05
Rps8    2.648383e-08 -0.6596474 0.018 0.167 7.945150e-05

I picked two genes which were found in all three results to show how the fold change varies.

> f1[c("Cxcl2","Lpl"),]
             p_val avg_logFC pct.1 pct.2    p_val_adj
Cxcl2 2.153860e-18 -14.10672 0.002 0.111 3.528023e-14
Lpl   1.582957e-14  57.41164 0.047 0.444 2.592883e-10
> f2[c("Cxcl2","Lpl"),]
             p_val  avg_logFC pct.1 pct.2    p_val_adj
Cxcl2 2.060274e-18 -0.3262467 0.002 0.111 3.374729e-14
Lpl   2.822407e-14 -1.3114768 0.038 0.389 4.623102e-10
> f3[c("Cxcl2","Lpl"),]
           p_val  avg_logFC pct.1 pct.2 p_val_adj
Cxcl2 0.00328671 -27.233063 0.050 0.167         1
Lpl   0.44693810  -5.408191 0.117 0.556         1
seurat 10x single-cell RNA-Seq • 6.7k views
3.2 years ago

You shouldn't use SCT and integrated counts for anything outside of dimension reduction and clustering. Before differential expression you can run NormalizeCounts on the RNA assay.
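
A minimal sketch of that suggestion, assuming the normalisation step meant here is Seurat's NormalizeData() and a Seurat v3/v4-style API in which FindMarkers() still takes a `slot` argument. `obj`, the identity names, and the `group.by` column are taken from the question; the normalisation parameters shown are the function's documented defaults, not something stated in this thread.

```r
library(Seurat)

# `obj` is assumed to be the integrated object from the question.
# Switch to the RNA assay so downstream steps operate on the original counts.
DefaultAssay(obj) <- "RNA"

# Log-normalise the raw counts; this fills the "data" slot of the RNA assay.
obj <- NormalizeData(obj, normalization.method = "LogNormalize", scale.factor = 10000)

# Run the same comparison as in the question on the normalised RNA assay.
markers <- FindMarkers(
  obj,
  ident.1  = "micro-wt-lps-01",
  ident.2  = "micro-wt-lps-05",
  group.by = "cell_type_condition_cluster",
  assay    = "RNA",
  slot     = "data"
)
head(markers)
```

The integrated and SCT assays stay in the object for clustering and visualisation; only the differential expression step reads from RNA.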

Relevant Seurat links:


Thanks for the reply and links. By NormalizeCounts, I guess you mean NormalizeData. Also, why normalised data? Shouldn't I be using raw counts with covariates?
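
On the "why normalised data" part, one possible explanation for the extreme fold changes in the RNA result above, sketched in base R. It assumes that Seurat v3 computes avg_logFC for slot="data" by undoing a log1p transform (expm1), averaging, adding a pseudocount of 1, and taking the natural log. The helper `avg_logfc` and the toy vectors are hypothetical and only illustrate that arithmetic; library-size scaling is ignored.

```r
# Hypothetical helper mirroring the assumed Seurat v3 fold-change arithmetic.
avg_logfc <- function(x1, x2, pseudocount = 1) {
  log(mean(expm1(x1)) + pseudocount) - log(mean(expm1(x2)) + pseudocount)
}

# Toy raw counts for one gene in two groups of four cells (made-up values).
raw1 <- c(0, 0, 0, 12)
raw2 <- c(0, 1, 0, 2)

# Log-transformed values, as the "data" slot would hold after normalisation.
norm1 <- log1p(raw1)
norm2 <- log1p(raw2)

# On log-transformed input, expm1() recovers the counts and the result is modest.
fc_norm <- avg_logfc(norm1, norm2)

# On raw counts, expm1() exponentiates the counts themselves and the result explodes.
fc_raw <- avg_logfc(raw1, raw2)

print(c(normalised = fc_norm, raw = fc_raw))
```

If the RNA assay was never normalised, its "data" slot still holds raw counts, which would be consistent with values like 57.4 for Lpl.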
