There is more than one way to skin a cat.

So it is with Seurat. There is more than one way to analyze your scRNASeq data using Seurat, and the choice is mostly guided by the data you have in hand. Given the two normalization strategies that Seurat provides, i.e. `lognormalization` and `SCT`, the analysis regimens can be classified as follows:

Say you have two scRNASeq samples, `s_ctrl` and `s_treat`, and you wish to carry out differential expression analysis after proper cell clustering.

Now you can possibly **skin** your scRNASeq data in the following ways:

**LogNormalization**

1. Merge the `s_ctrl` and `s_treat` matrices, perform log-normalization on the concatenated matrix, then perform clustering and other downstream analysis.
2. Perform log-normalization separately on the `s_ctrl` and `s_treat` matrices, then merge the two matrices and perform clustering and other downstream analysis.
3. Integrate the `s_ctrl` and `s_treat` samples by performing log-normalization separately on each matrix and following the standard Seurat protocol for further data analysis.
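
One thing that may help frame the first two strategies above: as documented, Seurat's log-normalization works cell by cell (each cell's counts are divided by that cell's total, multiplied by a scale factor of 10,000 by default, then log1p-transformed). If that holds, the normalized values themselves should not depend on whether the matrices are merged before or after normalization, and any difference between the two strategies would have to come from later steps. Below is a toy sketch in plain Python, not Seurat code; the function `log_normalize` and the count values are made up for illustration, and both samples are assumed to share the same genes.

```python
import math


def log_normalize(cells, scale_factor=10_000):
    """Per-cell log-normalization: log1p(count / cell_total * scale_factor).

    Hypothetical helper mimicking the documented behaviour of log-normalization;
    it is not Seurat's implementation.
    """
    normalized = []
    for counts in cells:
        total = sum(counts)
        if total == 0:
            # An empty cell stays all zeros instead of dividing by zero.
            normalized.append([0.0 for _ in counts])
            continue
        normalized.append([math.log1p(c / total * scale_factor) for c in counts])
    return normalized


# Toy count matrices: rows are cells, columns are the same three genes.
s_ctrl = [[10, 0, 5], [3, 7, 0]]
s_treat = [[0, 12, 8], [4, 4, 2]]

# Strategy 1: merge first, then normalize the concatenated matrix.
merge_then_normalize = log_normalize(s_ctrl + s_treat)

# Strategy 2: normalize each sample separately, then merge.
normalize_then_merge = log_normalize(s_ctrl) + log_normalize(s_treat)

# Each cell is normalized using only its own total, so the results match.
print(merge_then_normalize == normalize_then_merge)
```

The sketch covers only the normalization step. Variable feature selection, scaling and PCA are computed across cells, so they can still differ depending on whether they are run per sample or on the merged object.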

**SCT Normalization**

4. Merge the `s_ctrl` and `s_treat` matrices, perform SCT normalization on the concatenated matrix, then perform clustering and other downstream analysis.
5. Perform SCT normalization on the `s_ctrl` and `s_treat` matrices separately, then merge them to perform clustering and other downstream analysis.
6. Integrate the `s_ctrl` and `s_treat` samples by performing SCT normalization separately on each matrix and following the standard Seurat protocol for further data analysis.

Strategies 3 and 6 are clearly discussed in the Seurat Integration Workflow here. However, no such clarity has been offered as to when `merging` is appropriate and when `integration` is. Some explanation is offered by the HBCTraining material here, which states that:

Generally, we always look at our clustering without integration before deciding whether we need to perform any alignment. Do not just always perform integration because you think there might be differences - explore the data.

and

Condition-specific clustering of the cells indicates that we need to integrate the cells across conditions to ensure that cells of the same cell type cluster together.

Also, the integration method "expects 'correspondences' or shared biological states among at least a subset of single cells across the groups."
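
To make "condition-specific clustering" a little more concrete, one rough check is to tabulate how each cluster splits between conditions after clustering the merged data. The following is a minimal sketch in plain Python, not part of Seurat or the HBC material; the function names, the 0.9 cutoff and the labels are all hypothetical.

```python
from collections import Counter, defaultdict


def condition_fractions(clusters, conditions):
    """Return {cluster: {condition: fraction of that cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, condition in zip(clusters, conditions):
        counts[cluster][condition] += 1
    return {
        cluster: {cond: n / sum(c.values()) for cond, n in c.items()}
        for cluster, c in counts.items()
    }


def condition_specific_clusters(clusters, conditions, threshold=0.9):
    """List clusters where one condition exceeds `threshold` (an arbitrary cutoff)."""
    fractions = condition_fractions(clusters, conditions)
    return sorted(
        cluster
        for cluster, frac in fractions.items()
        if max(frac.values()) > threshold
    )


# Made-up labels for eight cells after clustering the merged data.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
conditions = ["s_ctrl", "s_treat", "s_ctrl", "s_treat",
              "s_treat", "s_treat", "s_treat", "s_treat"]

print(condition_fractions(clusters, conditions))
print(condition_specific_clusters(clusters, conditions))
```

A cluster flagged this way is only a prompt to look closer. It could reflect a batch effect that integration would remove, or a genuine treatment-specific cell state, and unequal cell numbers per sample will skew the fractions.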

Now, let's assume that our `s_ctrl` and `s_treat` samples overlap fairly well in the UMAP and that no condition-specific clustering (or stacking) is observed when we merge the matrices and perform the clustering. Which of strategies 1, 2, 4 and 5 is appropriate for our data? No systematic effort had been made until recently (a paper on bioRxiv) to address that question, and it has remained unanswered in the `seurat issues` and `biostars posts` listed below.

**GitHub Issues:** Issue 1, Issue 2, Issue 3, Issue 4, Issue 5

**Biostars Issues:** Post 1, Post 2

The bioRxiv paper mentioned above discusses these four strategies, observes over-merging when using `SCTransform` in both strategies 4 and 5, as shown below, and finds strategy 2 the most appropriate. The code used by the paper is shared here. But I wish to gather thoughts from the `scRNASeq` community on which approach works well and when, and I invite further discussion of this neglected yet important choice, which affects downstream analysis.