single cell: differential expression between cluster subsets
11 weeks ago
Lee • 0

Hello,

I'm currently running a single-cell analysis, and I have a question: I would like to check whether the following makes sense statistically, or whether I'm missing something.

In Seurat we can run differential expression (DE) analysis between clusters (Cluster1 vs Cluster2) or within a cluster (Cluster1_Ctrl vs Cluster1_Treated). That's all good.

However, the user keeps requesting a DE analysis of one cluster subset vs another cluster subset, e.g.

  1. Cluster1_Ctrl vs Cluster2_Ctrl
  2. Cluster1_Treated vs Cluster2_Treated

I've tried searching here and other places but couldn't find anything. Does this make sense, statistically? If not, why? Or is there a way to run this kind of analysis in Seurat that I'm missing?

Thank you in advance for your help and opinions!

scrna-seq statistics seurat • 6.8k views

Thank you, I will check out the links. I'm not certain that pseudobulking is the issue here, though; I've run pseudobulk DE before. What I haven't tried is what the user is requesting: comparing a subset of Cluster1 vs a subset of Cluster2.


Thanks for clarifying. I missed the "subset" part of your question. What criteria will you use to subset the data? Can you share the basis of this odd request?


It's just as it is. Actually it's not Ctrl vs Treated, it's Wild Type (WT) vs Mutant. Almost every cluster contains a mix of WT and Mutant cells.

The user just really, really wants to compare Cluster1_WT vs Cluster2_WT cells, and the same for Mutant.

I think I've come up with a possible solution: split the Seurat object into two, "WT" and "Mutant", then run clustering separately on each object. After that the user can compare Cluster1 vs Cluster2 as much as they want.

12 days ago
Kevin Blighe ★ 90k

Yes, the analysis that you describe makes sense statistically. In single-cell RNA sequencing, clusters typically represent distinct cell populations or states. Comparing differential expression between subsets of cells from different clusters under the same condition (for example, Cluster1_WT versus Cluster2_WT) is equivalent to testing for gene expression differences between those populations within that condition. This is valid as long as each subset contains sufficient cells for reliable statistical testing, and the clustering is robust.
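
One way to check the "sufficient cells" condition is to tabulate cells per cluster and condition before running any test. The sketch below is base R on a toy metadata table. The column names `seurat_clusters` and `Condition` mirror what a Seurat `meta.data` slot often contains but are assumptions here, and the threshold of 3 cells is only a placeholder, not a recommendation.

```r
# Toy stand-in for YourSeuratObject@meta.data (hypothetical values).
meta <- data.frame(
  seurat_clusters = c("1", "1", "1", "1", "2", "2", "2", "2", "2", "1"),
  Condition       = c("WT", "WT", "WT", "Mutant", "WT", "WT",
                      "Mutant", "Mutant", "Mutant", "Mutant"),
  stringsAsFactors = FALSE
)

# Cells per cluster (rows) and condition (columns).
cell_counts <- table(cluster = meta$seurat_clusters, condition = meta$Condition)
print(cell_counts)

# Flag cluster-condition groups below a chosen minimum (placeholder threshold).
min_cells <- 3
too_small <- as.data.frame(cell_counts, stringsAsFactors = FALSE)
too_small <- too_small[too_small$Freq < min_cells, ]
print(too_small)
```

On a real object, `table(YourSeuratObject$seurat_clusters, YourSeuratObject$Condition)` should give the same kind of table, assuming those metadata columns exist.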

The potential issue is not statistical invalidity but interpretation. If the clusters were identified using all cells (both wild-type and mutant), the cluster assignments already reflect condition-related differences to some extent. Subsetting by condition and then comparing clusters restricts the comparison to differences between the two cell populations within that one condition.

In Seurat, you can perform this analysis without splitting the object and re-clustering, which risks altering cluster definitions. Instead, proceed as follows:

# Subset to wild-type cells
WT <- subset(YourSeuratObject, subset = Condition == "WT")

# Set cluster identities
Idents(WT) <- "seurat_clusters"  # or your cluster column

# Run differential expression between Cluster1 and Cluster2
DE_WT <- FindMarkers(WT, ident.1 = "1", ident.2 = "2", test.use = "wilcox")  # adjust test as needed

Repeat the process for mutant cells. This approach uses the original clustering while restricting to the condition of interest.
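
To avoid copying the block by hand for each condition, the same steps could be wrapped in a small helper. This is only a sketch: `de_by_condition` and its argument names are hypothetical, the metadata columns `Condition` and `seurat_clusters` are assumed to exist, and it relies on the `group.by` argument of `FindMarkers` instead of resetting `Idents`.

```r
# Sketch: Cluster1 vs Cluster2 within each condition, reusing the original clustering.
# Requires the Seurat package; the function and argument names are hypothetical.
de_by_condition <- function(obj,
                            conditions    = c("WT", "Mutant"),
                            condition_col = "Condition",
                            cluster_col   = "seurat_clusters",
                            ident.1 = "1", ident.2 = "2") {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))
  results <- lapply(conditions, function(cond) {
    # Select cells by name; which() drops cells with a missing condition label.
    keep <- rownames(obj@meta.data)[which(obj@meta.data[[condition_col]] == cond)]
    sub  <- subset(obj, cells = keep)
    # group.by tells FindMarkers which metadata column defines the two groups.
    Seurat::FindMarkers(sub, ident.1 = ident.1, ident.2 = ident.2,
                        group.by = cluster_col, test.use = "wilcox")
  })
  names(results) <- conditions
  results
}
```

Usage would look like `de <- de_by_condition(YourSeuratObject)`, followed by `head(de$WT)` and `head(de$Mutant)`.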

If cell numbers in the subsets are low, consider pseudobulking (aggregating expression across the cells in each cluster-condition combination) before differential expression to improve power, but this is optional.
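
For illustration, the aggregation step can be sketched in base R on a toy count matrix. Seurat provides `AggregateExpression` for this on real objects; the code below is not that function, just the same idea written out. The `sample_id` column (a biological replicate label) is an assumption that does not appear in the question, and bulk-style tools such as DESeq2 or edgeR would generally need more than one replicate per group to test the resulting matrix.

```r
# Toy raw-count matrix: genes in rows, cells in columns (hypothetical values).
counts <- matrix(
  c(1, 0, 2, 3, 0, 1,
    4, 1, 0, 2, 2, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB"), paste0("cell", 1:6))
)

# Per-cell labels; `sample_id` (biological replicate) is an assumed column.
cell_info <- data.frame(
  sample_id = c("s1", "s1", "s2", "s2", "s1", "s2"),
  cluster   = c("1", "1", "1", "1", "2", "2"),
  condition = "WT"
)

# One pseudobulk group per sample x cluster x condition.
group <- paste(cell_info$sample_id,
               paste0("Cluster", cell_info$cluster),
               cell_info$condition, sep = "_")

# rowsum() sums rows by group, so transpose to cells-by-genes and back.
pseudobulk <- t(rowsum(t(counts), group))
print(pseudobulk)
```

Each column of `pseudobulk` is then one sample-level profile for a cluster-condition combination, which is the unit a bulk DE method would compare.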

Kevin
