I have a noob question: when finding clusters and running dimensionality reduction (UMAP, for example) in an experiment, is it relevant to regress out the sex of the individuals? I don't really see a separation by sex in my samples, but there are so many of them that maybe there is some and it's just not that obvious. Or maybe these functions (FindClusters, RunUMAP) already take this into consideration?

I am just wondering how much variation could be coming from the sex chromosomes, and whether people usually take this into consideration (regress it out), since it's not covered in the tutorials Seurat provides.

I can just try both and see how different they are, but I am looking for an experienced opinion so that I don't base my conclusion only on what I can see in this dataset. (And if I see almost the same reduction either way, is it important to keep the regressed version?)

If you do usually regress this out, is the RegressOut function from Seurat the way to go? If so, could you elaborate on how you scale the data after that?

Thank you!

If you have multiple samples, I suppose you're using some sort of batch correction / integration workflow (maybe even Seurat integration?). Depending on your workflow, the inter-sample differences, sex included, might already be removed by the batch correction / integration procedure. What you need to do is compare the female and male samples on the UMAP before and after integration / batch correction. If some clusters do not align between female and male samples, is that what you expect based on the tissue under study? If it's not, try pseudo-bulk differential gene expression (DGE) testing between males and females; maybe there's some unknown biology you've just discovered.
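To make the pseudo-bulk idea concrete, here is a rough, language-agnostic sketch in Python with NumPy. It is not Seurat code and not a proper statistical test. It only shows the aggregation step (sum raw counts per sample, so that each sample rather than each cell is one observation) followed by a naive male-vs-female log2 fold change. The function names (`pseudobulk`, `log2_cpm`, `sex_log2fc`), the toy counts, and the sample labels are all made up for illustration. For real testing you would pass the aggregated counts to a dedicated tool such as DESeq2, edgeR, or limma.

```python
import numpy as np

def pseudobulk(counts, sample_ids):
    """Sum raw counts over all cells belonging to each sample.

    counts: (n_cells, n_genes) array of raw counts.
    sample_ids: length-n_cells sequence of sample labels.
    Returns (samples, bulk) where bulk has one row per sample.
    """
    sample_ids = np.asarray(sample_ids)
    samples = np.unique(sample_ids)
    bulk = np.vstack([counts[sample_ids == s].sum(axis=0) for s in samples])
    return samples, bulk

def log2_cpm(bulk, pseudocount=1.0):
    """Scale each sample to counts per million, then take log2."""
    lib_size = bulk.sum(axis=1, keepdims=True)
    return np.log2(bulk / lib_size * 1e6 + pseudocount)

def sex_log2fc(counts, sample_ids, sample_sex):
    """Per-gene difference in mean log2-CPM, male samples minus female samples."""
    samples, bulk = pseudobulk(counts, sample_ids)
    expr = log2_cpm(bulk)
    is_male = np.array([sample_sex[s] == "M" for s in samples])
    return expr[is_male].mean(axis=0) - expr[~is_male].mean(axis=0)

# Toy data (hypothetical): 4 samples x 100 cells each, 5 genes.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(400, 5))
sample_ids = np.repeat(["s1", "s2", "s3", "s4"], 100)
sample_sex = {"s1": "F", "s2": "F", "s3": "M", "s4": "M"}

# Make gene 0 behave like a Y-linked gene: silent in cells from female samples.
counts[np.isin(sample_ids, ["s1", "s2"]), 0] = 0

lfc = sex_log2fc(counts, sample_ids, sample_sex)
print(np.round(lfc, 2))  # gene 0 should stand out; the other genes should stay near 0
```

In a real dataset you would typically do this per cluster or cell type, and you would expect the known sex-linked genes to come out on top. Anything beyond those is the part worth a closer look.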


