Hi all,

I've been using Seurat for multi-sample RNA-Seq data as described in this tutorial:

https://satijalab.org/seurat/v3.0/immune_alignment.html

i.e. creating subsets for each sample then performing integration of the subsets.

Elsewhere in the Seurat docs, though, SCTransform is described and recommended as a replacement for the usual NormalizeData, ScaleData, and FindVariableFeatures functions.

In the subset-and-integrate workflow, though, NormalizeData and FindVariableFeatures are run only on the individual subsets, whilst ScaleData is applied to the integrated data. So I'm wondering whether SCTransform is compatible with multi-sample data, or whether I should just stick to using the three functions (NormalizeData, ScaleData, and FindVariableFeatures) individually?
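For concreteness, the workflow I mean looks roughly like the sketch below. The wrapper function, the `combined` object, and the "sample" metadata column are my own placeholder names rather than anything from the tutorial, and the parameter values are just typical defaults.

```r
# Rough sketch of the subset-and-integrate workflow (Seurat v3 style).
# Assumes `combined` is a Seurat object with a metadata column naming each
# cell's sample; "sample" is a placeholder column name.
run_standard_integration <- function(combined, split_col = "sample") {
  # One Seurat object per sample
  obj_list <- Seurat::SplitObject(combined, split.by = split_col)

  # Normalisation and variable-feature selection happen per subset
  obj_list <- lapply(obj_list, function(x) {
    x <- Seurat::NormalizeData(x, verbose = FALSE)
    Seurat::FindVariableFeatures(x, selection.method = "vst",
                                 nfeatures = 2000, verbose = FALSE)
  })

  # Find anchors between the subsets and integrate them
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj_list, dims = 1:20)
  integrated <- Seurat::IntegrateData(anchorset = anchors, dims = 1:20)

  # Scaling is applied only afterwards, to the integrated assay
  Seurat::ScaleData(integrated, assay = "integrated", verbose = FALSE)
}
```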

Many thanks,

Steve

I don't fully understand why one couldn't do the integration on the Pearson residuals; with the recent release they're being returned (as "corrected counts", I believe), so I'd assume one could use them in the recommended way?

Looks like there is a new `PrepSCTIntegration()` function: https://github.com/satijalab/seurat/blob/30f0df6b979cb61df0f093ce8eea06c1caebd024/R/integration.R#L1139-L1257

I had the same concern, but haven't had time to look into it in more depth to see what I am missing. At this point, I am assuming the "official" protocol will be posted at any moment.
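Until then, here is a minimal sketch of how the SCTransform-based integration seems to fit together, based on the function signatures in the Seurat source. The wrapper function, the `combined` object, the "sample" column, and the choice of 3000 features are my assumptions, not an official protocol.

```r
# Sketch only: SCTransform + Seurat v3 integration.
# Assumes `combined` is a Seurat object with a metadata column naming each
# cell's sample; "sample" is a placeholder column name.
run_sct_integration <- function(combined, split_col = "sample", n_features = 3000) {
  # One Seurat object per sample
  obj_list <- Seurat::SplitObject(combined, split.by = split_col)

  # SCTransform stands in for NormalizeData / FindVariableFeatures / ScaleData
  obj_list <- lapply(obj_list, Seurat::SCTransform, verbose = FALSE)

  # Choose features for integration, then make sure the Pearson residuals
  # for those features are present in every object
  features <- Seurat::SelectIntegrationFeatures(object.list = obj_list,
                                                nfeatures = n_features)
  obj_list <- Seurat::PrepSCTIntegration(object.list = obj_list,
                                         anchor.features = features)

  # Anchor finding and integration are told that the data are SCT-normalised
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj_list,
                                            normalization.method = "SCT",
                                            anchor.features = features)
  Seurat::IntegrateData(anchorset = anchors, normalization.method = "SCT")
}
```

As far as I can tell, ScaleData is not run on the integrated assay in this variant; you would go straight to RunPCA on the result.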

Thanks Igor and Friederike, your replies are very helpful.

Just to clarify, my samples are all from the same batch, but the vignettes you point to are still applicable, particularly the one you provided here Igor. It seems to me that whilst it's possible to use SCTransform in this context, it's not currently obvious or intuitive how to do it - for example the question of whether or not to run ScaleData following integration. I think I'll just use the three functions individually for now, until the developers have completed their vignette on combining sctransform with Seurat v3 integration.

Why do you think you need the integration step? If there's no obvious batch effect, I would just run SCTransform and call it a day.

Yes, I think you're right, I don't need to do that. What I'm really interested in is being able to produce various plots which are either grouped or split by sample, and I was following the steps in that tutorial because it seemed to show how to do that. However, in their case the two samples are from different batches, hence the separation of the NormalizeData, FindVariableFeatures and ScaleData steps.

In my case, I think all I need to do is use the AddMetaData function to label cells differently on the whole dataset, then I can as you say just apply SCTransform to the whole thing.
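Something along these lines, as a rough sketch. The helper function, the `sample_labels` vector, the "sample" column name, and the use of 30 PCs are all my own placeholders.

```r
# Sketch: label cells by sample, then run SCTransform on the whole dataset.
# `sample_labels` is assumed to be a named character vector whose names are
# the cell barcodes in `obj` and whose values are the sample names.
label_and_transform <- function(obj, sample_labels, n_dims = 30) {
  # Store the sample of origin in a metadata column called "sample"
  obj <- Seurat::AddMetaData(obj, metadata = sample_labels, col.name = "sample")

  # A single SCTransform call on all cells together, with no integration step
  obj <- Seurat::SCTransform(obj, verbose = FALSE)

  # Dimensional reduction for plotting
  obj <- Seurat::RunPCA(obj, verbose = FALSE)
  Seurat::RunUMAP(obj, dims = seq_len(n_dims), verbose = FALSE)
}

# Example use (not run here):
# obj <- label_and_transform(obj, sample_labels)
# Seurat::DimPlot(obj, reduction = "umap", group.by = "sample")  # coloured by sample
# Seurat::DimPlot(obj, reduction = "umap", split.by = "sample")  # one panel per sample
```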

Many thanks!

Yes, I'm fairly confident that that should work!