I have a CropSeq dataset to analyze, and I am unsure whether I should regress out the cell cycle before downstream identification of clusters, differentially expressed genes, and gene set enrichment analysis.
According to Seurat's standard single-cell analysis workflow, it is advisable to use SCTransform and regress out cell cycle effects before proceeding with downstream analysis (https://satijalab.org/seurat/archive/v3.1/cell_cycle_vignette.html).
However, according to the paper that pioneered CropSeq data analysis (https://www.sciencedirect.com/science/article/pii/S0092867416316609) and the corresponding GitHub repository (https://github.com/mhorlbeck/ScreenProcessing), the authors only identify the cell cycle phase of each cell; they do not regress out the cell cycle effect when normalizing the data.
So my question is: Should I regress out cell cycle effects in my CropSeq data analysis or not?
Thank you very much in advance. Eleni
It depends on whether cell cycle is a confounder in your experiment and whether regression would remove interesting biology. Explore your data. Assign a cell cycle state to each cell and then check whether this appears to introduce a bias. One example of a bias would be a single cell type being split into two clusters purely by cell cycle phase, possibly with the same pattern repeating across the entire dataset. In contrast, if you have a number of cell types, and some of them are simply proliferating strongly while others are quiescent, then removing that effect might remove interesting biology. Can you add some details?
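To make the check concrete, here is a minimal sketch of one way to quantify how strongly clusters follow cell cycle phase. It assumes you have already exported one cluster label and one phase label per cell (for example the `Phase` column that Seurat's `CellCycleScoring` adds to the metadata). The function names `phase_fractions` and `cramers_v` and the toy labels are made up for illustration, and Cramér's V is just one possible summary, not something the vignette or the paper prescribes.

```python
from collections import Counter
from math import sqrt


def phase_fractions(clusters, phases):
    """Return, for each cluster, the fraction of its cells in each phase."""
    counts = Counter(zip(clusters, phases))
    totals = Counter(clusters)
    phase_names = sorted(set(phases))
    return {
        c: {p: counts[(c, p)] / totals[c] for p in phase_names}
        for c in sorted(totals)
    }


def cramers_v(clusters, phases):
    """Cramér's V between two labelings: 0 = independent, 1 = fully coupled."""
    n = len(clusters)
    counts = Counter(zip(clusters, phases))
    row = Counter(clusters)
    col = Counter(phases)
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0
    # Pearson chi-square statistic over the cluster x phase contingency table.
    chi2 = 0.0
    for c in row:
        for p in col:
            expected = row[c] * col[p] / n
            chi2 += (counts[(c, p)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Hypothetical toy labels: cluster "0" is all G1, cluster "1" is all cycling.
clusters = ["0"] * 4 + ["1"] * 4
phases = ["G1", "G1", "G1", "G1", "S", "S", "G2M", "G2M"]

for cluster, fractions in phase_fractions(clusters, phases).items():
    print(cluster, fractions)
print("Cramér's V:", cramers_v(clusters, phases))  # 1.0 for this toy example
```

A value near 1 only says that clusters and phase are tightly coupled. Whether that is a technical nuisance (one cell type split by phase) or real biology (proliferating versus quiescent populations) still has to be judged from marker genes and the experimental design.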
Thank you ATpoint for the suggestion. It makes total sense, and I will try it out. I will see how it goes and come back with more details if needed.