Regressing out a covariate from Seurat integrated data
17 months ago

Hi, I'm analyzing data from 8 individuals (two brain regions per individual), so I have 16 datasets in total. I see that the PCA clustering is partly driven by individual (the two regions from each individual cluster together), although the effects of affected status and region are much stronger than the effect of individual. Anyway, I would like to regress out the individual covariate, but I'm not sure at which step I should do so. As it is not recommended to regress out this kind of covariate using SCTransform, I was wondering whether I should regress it out after integration, and what the code would be. Everything I found was for batch correction. I would really appreciate it if you could help me with this and share the code if you know it. Thanks, Paria

single-cell seurat • 2.2k views

Which Seurat workflow are you following? SCTransform + integration?

Yes, I do SCTransform + integration.

Thanks for your comment. I see what you mean, but it is not recommended in most of the forum posts I have read. I also tried it and got an error which I couldn't fix. It is described in the post below: C: How to regress out variants using SCTransform function?

Regarding the second approach, I do have 16 objects from 8 individuals. Actually, I don't see a batch effect, so I think it's better not to torture my data. I'm wondering whether there is any way to regress out the covariate before or after integration without using SCTransform, or whether I could do it with SCTransform (I haven't managed to so far).

17 months ago
fracarb8 ▴ 810

Why do you think it is not recommended?

The sctransform_vignette clearly states that "we can also remove confounding sources of variation".

I wouldn't worry too much and I would run SCTransform(myObject, vars.to.regress = "SampleID", verbose = TRUE) (assuming SampleID is where your sample names are stored).

If you don't want to regress them out, I often use a different approach that allows you to not worry about batch effects coming from different samples:

1) create 1 object per sample (In your case, you would have 8 objects each containing the two regions)

2) create a list containing all the objects

3) Integrate the objects together (FindIntegrationAnchors + IntegrateData)

SCTransform and integration are two different things. SCTransform is a way of normalising and scaling, while integration is the process of combining datasets that have already been normalised and scaled on their own. You can regress out confounders with SCTransform, and you don't need to worry about them while integrating.

In the case presented by @paria.alipour, you can use 2 approaches:

A) combine all counts together and treat them as 1 dataset

In this case, which is not what @paria.alipour is doing, you need to regress out the samples: SCTransform(myObject, vars.to.regress = "SampleID", verbose = TRUE)
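A minimal sketch of approach A, assuming the Seurat package is installed and that the per-sample objects are held in a named list. The function name merge_and_regress, the list object_list, and the metadata column "SampleID" are illustrative names, not something taken from the poster's data.

```r
# Sketch only: requires the Seurat package. `object_list` is assumed to be a
# named list of Seurat objects, one per sample; all names are hypothetical.
merge_and_regress <- function(object_list, sample_col = "SampleID") {
  stopifnot(length(object_list) >= 2, !is.null(names(object_list)))

  # Label every cell with the name of the list element it came from.
  for (nm in names(object_list)) {
    obj <- object_list[[nm]]
    labels <- setNames(rep(nm, ncol(obj)), colnames(obj))
    object_list[[nm]] <- Seurat::AddMetaData(obj, metadata = labels,
                                             col.name = sample_col)
  }

  # Combine the raw counts of all samples into one object.
  merged <- merge(x = object_list[[1]], y = object_list[-1],
                  add.cell.ids = names(object_list))

  # Normalise once and regress out the sample label, which now has
  # several levels because all samples sit in the same object.
  Seurat::SCTransform(merged, vars.to.regress = sample_col, verbose = TRUE)
}

# Usage (not run): myObject <- merge_and_regress(list(S1 = obj1, S2 = obj2))
```

Regressing on the sample label only makes sense here because the merged object contains more than one sample.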

B) integrate the different samples

Depending on how you decide to group the samples, you may or may not need to regress anything out.

1st step: run SCTransform(myObject, verbose = TRUE).

As all the cells come from one single sample, you don't need to regress out the sample, but you could regress out other sources of variation (region, percentage of mitochondrial reads, or anything else that you know could have an effect).

In this case I would account for the region of the brain (region), even if the PCA suggests there is no huge effect: SCTransform(myObject, vars.to.regress = "region", verbose = TRUE) or SCTransform(myObject, vars.to.regress = c("region","mt.percentage"), verbose = TRUE)

2nd step: Integrate the objects together (FindIntegrationAnchors + IntegrateData)

You don't need to care about regressing, because you did it already in the previous steps.
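The two steps of approach B can be sketched as follows, assuming the Seurat package is installed and that object_list is a list with one Seurat object per individual, each carrying a "region" metadata column. The function name integrate_per_sample and its arguments are hypothetical. The SelectIntegrationFeatures and PrepSCTIntegration calls are not mentioned in the answer above; they are included because the Seurat SCTransform integration workflow uses them before finding anchors.

```r
# Sketch only: requires the Seurat package. `object_list` is assumed to hold
# one Seurat object per individual; `covariates` must name metadata columns
# that vary within each object (e.g. "region" when both regions are present).
integrate_per_sample <- function(object_list, covariates = "region",
                                 n_features = 3000) {
  # 1st step: normalise each object on its own, regressing out the covariates.
  object_list <- lapply(object_list, function(obj) {
    Seurat::SCTransform(obj, vars.to.regress = covariates, verbose = FALSE)
  })

  # 2nd step: pick shared features, prepare the SCT residuals, then integrate.
  features <- Seurat::SelectIntegrationFeatures(object.list = object_list,
                                                nfeatures = n_features)
  object_list <- Seurat::PrepSCTIntegration(object.list = object_list,
                                            anchor.features = features)
  anchors <- Seurat::FindIntegrationAnchors(object.list = object_list,
                                            normalization.method = "SCT",
                                            anchor.features = features)
  Seurat::IntegrateData(anchorset = anchors, normalization.method = "SCT")
}

# Usage (not run): integrated <- integrate_per_sample(list(ind1, ind2, ind3))
```

If each object holds a single region instead (16 objects rather than 8), "region" is constant within an object and should be dropped from covariates.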

If you're following the SCTransform + integration workflow, you can't regress on sample identity, since within each object there will be only one factor level.
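A small base-R illustration of why a single factor level cannot be regressed out. It uses simulated values and plain lm(), not Seurat internals, so it only demonstrates the statistical point; all variable names are made up for the example.

```r
set.seed(1)
expr <- rnorm(20)  # simulated expression values for 20 cells

# One object per individual: the sample covariate has a single level,
# so there is no contrast to estimate and the model cannot be fitted.
one_sample <- factor(rep("S1", 20))
one_level_fit <- tryCatch(
  lm(expr ~ one_sample),
  error = function(e) conditionMessage(e)
)
# `one_level_fit` now holds an error message rather than a fitted model.

# Merged object: two samples, so the sample effect can be estimated and removed.
two_samples <- factor(rep(c("S1", "S2"), each = 10))
expr_shifted <- expr + ifelse(two_samples == "S2", 2, 0)  # add a sample shift
resid_two <- residuals(lm(expr_shifted ~ two_samples))

# After regression, the per-sample means of the residuals are approximately zero.
sample_means <- tapply(resid_two, two_samples, mean)
```

The same logic applies to any covariate passed to vars.to.regress: it has to vary among the cells of the object being normalised.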

Do you recommend any other way in which I could regress out this covariate?
