A brief question regarding regressing out cell cycle effects or ribosomal/mitochondrial genes
17 months ago
cook.675 ▴ 60

I'm using the Seurat platform on some data, but I think the question applies regardless of platform:

I'm seeing some PCs grouped on cell cycle, and I also notice heterogeneous grouping of cells by cell cycle stage in the PC plot.

If I pass vars.to.regress to ScaleData to regress these out, does this only affect the clustering? The between-cluster DEG list works on the raw counts, right? So regressing out these genes will only cause the cells to NOT cluster based on these genes, and won't affect the DEG analysis once those clusters are set. Is that correct?

If I am correct, which other analyses in a general pipeline would regressing out genes affect? I have heard it said that if the feature you are regressing out is a biological feature of interest, then you should leave it in.

In my case, cell cycle is a biological feature of interest, but I do not want the cell clustering to be affected by the cell cycle stage. If that's the case, regressing it out so the cells don't cluster on this feature, then using the raw data to look at the cell cycle stage after clustering, should be fine, right? The data could even be re-scaled after clustering, if need be, with these genes now included, and the cells would retain their cluster IDs, right?

Thanks in advance!

Seurat scRNA-seq • 1.4k views
17 months ago

Your interpretation is mostly correct. Most people run differential expression on the normalized counts rather than on the regressed, scaled values, so the regression should only affect clustering and dimensionality reduction (i.e. you shouldn't have clusters of purely S-phase cells off on their own in your t-SNE/UMAP). And yes, you could re-scale and redo dimensionality reduction, but you'd be changing the data from which the clusters were defined, which seems like a muddy idea at best.
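To make the distinction concrete, here is a minimal, platform-neutral sketch in plain Python (not Seurat code). It assumes that "regressing out" means fitting a per-gene linear model against the covariate and keeping the residuals, which is the general idea behind vars.to.regress. The `regress_out` helper and the toy values are hypothetical, for illustration only.

```python
from statistics import mean

def regress_out(expr, covariate):
    """Residuals of a one-covariate least-squares fit (expr ~ covariate)."""
    mx, my = mean(covariate), mean(expr)
    sxx = sum((x - mx) ** 2 for x in covariate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(covariate, expr))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(covariate, expr)]

# Hypothetical toy data: six cells, one cell-cycle score, one gene.
s_score = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]       # three resting, three S-phase cells
normalized = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]    # expression tracks the score

# The residuals are what would feed scaling, PCA and clustering.
residuals = regress_out(normalized, s_score)

# The normalized values are not modified, so a DE test run on them
# still sees the difference between resting and S-phase cells.
gap_for_de = mean(normalized[3:]) - mean(normalized[:3])
gap_for_clustering = mean(residuals[3:]) - mean(residuals[:3])
print(gap_for_de, gap_for_clustering)
```

In this sketch the cell-cycle difference disappears from the values used for clustering while the normalized values keep it, which is why the regression and the differential expression step can be treated as separate concerns.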

In your case, since you're interested in cell cycle, I'd likely try both leaving it in and regressing it out. In fact, comparing the two may help you identify subclusters that are biologically interesting and clarify cell types (proliferating vs. resting cells of the same type, etc.).


Is it fair to regress out batch effects on highly variable genes only?


Probably not, as it could change which genes are considered "highly" variable.
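A small sketch of why the ranking can shift, in plain Python rather than any particular pipeline. It assumes variability is ranked by simple variance and that the batch is regressed out with a per-gene linear fit. The gene names, values, and helper functions are hypothetical.

```python
from statistics import mean, pvariance

def regress_out(expr, covariate):
    """Residuals of a one-covariate least-squares fit (expr ~ covariate)."""
    mx, my = mean(covariate), mean(expr)
    sxx = sum((x - mx) ** 2 for x in covariate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(covariate, expr))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(covariate, expr)]

def rank_by_variance(table):
    """Gene names ordered from most to least variable."""
    return sorted(table, key=lambda g: pvariance(table[g]), reverse=True)

# Hypothetical toy data: six cells split across two batches.
batch = [0, 0, 0, 1, 1, 1]
genes = {
    "batch_driven": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],  # varies mostly between batches
    "biological":   [0.0, 2.0, 4.0, 0.0, 2.0, 4.0],  # varies within each batch
}

before = rank_by_variance(genes)
corrected = {g: regress_out(v, batch) for g, v in genes.items()}
after = rank_by_variance(corrected)
print(before, after)
```

Here the gene whose variance comes mostly from the batch ranks first before correction and last after it, so a variable-gene list chosen before correction would not match one chosen afterwards.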
