Dear Community!
This is my first post here, and I thank you in advance for your input. I have a 10x single-nucleus dataset of mouse brain tissue, and after filtering for good-quality nuclei, I would like to get rid of the ambient RNA contamination. For this purpose, I used SoupX, as I have seen done in many publications.
However, on inspecting my soup table, I found that many nuclear transcripts, including Malat1, are categorized as "soup".
I used SoupX with default settings and did not specify which transcripts could be contamination. Do you think SoupX could be unfit for single-nucleus data, or can I somehow tweak it to give me more biologically relevant results?
Thank you (:
Ambient RNA is RNA that floats freely in the suspension because damaged cells or nuclei have released their contents; this free RNA then gets encapsulated into the GEMs along with the nuclei. It is not surprising that you see Malat1 in the soup, since it is highly expressed by basically all cells. The correction essentially subtracts the background level of RNA in the suspension from each nucleus. I am not sure why you think this is an issue.
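To make the subtraction idea concrete, here is a toy Python sketch. It is only an illustration of the principle and not SoupX's actual algorithm: the function names, the made-up counts, and the contamination fraction `rho = 0.1` are all assumptions for the example. The soup profile is taken as each gene's share of the UMIs pooled over empty droplets, and the expected ambient counts are then removed from a nucleus.

```python
# Toy illustration only; this is NOT SoupX's actual algorithm.
# All names and numbers here are hypothetical.

def soup_profile(empty_droplets):
    """Each gene's share of all UMIs pooled over the empty droplets."""
    totals = {}
    for droplet in empty_droplets:
        for gene, count in droplet.items():
            totals[gene] = totals.get(gene, 0) + count
    grand_total = sum(totals.values())
    return {gene: count / grand_total for gene, count in totals.items()}


def subtract_soup(nucleus_counts, profile, rho):
    """Remove the expected ambient counts from one nucleus.

    rho is the assumed contamination fraction: the share of this
    nucleus's UMIs that came from the soup.
    """
    n_umis = sum(nucleus_counts.values())
    corrected = {}
    for gene, count in nucleus_counts.items():
        expected_ambient = rho * n_umis * profile.get(gene, 0.0)
        # Clamp at zero so a gene never ends up with negative counts.
        corrected[gene] = max(count - expected_ambient, 0.0)
    return corrected


# Made-up empty droplets: Malat1 dominates the ambient pool.
empty = [
    {"Malat1": 8, "Snap25": 1, "Gfap": 1},
    {"Malat1": 7, "Snap25": 2, "Gfap": 1},
]
profile = soup_profile(empty)

# Made-up nucleus with 200 UMIs in total.
nucleus = {"Malat1": 120, "Snap25": 60, "Gfap": 20}
corrected = subtract_soup(nucleus, profile, rho=0.1)

for gene in nucleus:
    print(gene, nucleus[gene], round(corrected[gene], 2))
```

In this toy input Malat1 makes up three quarters of the soup, so it loses the most counts, yet most of its signal in the nucleus is retained. A gene being prominent in the soup profile therefore does not mean it is treated as pure contamination.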
For some reason I expected cytoplasmic transcripts to be considered soup, rather than purely nuclear ones, but your comment makes it clear that any transcript can end up in the ambient pool. Thank you! (: