I am planning a project that requires removing ambient RNA counts from cell-containing droplets. I will also need to aggregate counts from individual mice into "pseudobulk" profiles to use as input for between-condition* DE analysis in DESeq2. (*: between-condition DE, as opposed to DE for identifying cell types; for the latter I simply use a method such as FindMarkers in Seurat.)
I have used SoupX in the past for ambient RNA removal, but I cannot use its output with DESeq2 downstream because DESeq2 accepts only a counts matrix of integer values.
I am wondering if there are any ambient RNA-correction methods that produce integer counts. If anyone has familiarity with one or more of the following ambient-RNA correction methods, and could tell me which, if any, manage to remove whole counts (entire UMIs), I would be very appreciative. These are some of the methods I am checking one by one. Other methods I'm not thinking of would be good too!
Alternatively, could anyone with a statistical background chime in on whether rounding the SoupX-generated counts matrix to whole numbers would be acceptable input for DESeq2? The cells are being pseudobulked: all the cells that came from one sample, as identified by antibody-based hashtagging, will be summed together to generate one counts profile for that sample. I think this might remove any concerns specific to zero counts in individual cells, but I am not confident about that.
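To make the rounding question concrete, here is a minimal sketch in plain Python. The numbers are invented toy values, not real SoupX output, and `pseudobulk` is a hypothetical helper rather than a function from any package. It only illustrates that the order of operations matters: rounding each cell before summing can give a different pseudobulk count than summing the fractional values and rounding once per gene.

```python
# Toy corrected counts: rows are cells from one sample, columns are genes.
# These values are invented for illustration; they are not real SoupX output.
corrected = [
    [0.4, 2.7, 0.0],
    [0.4, 1.6, 5.2],
    [0.4, 0.9, 3.1],
]


def pseudobulk(cells):
    """Sum each gene (column) across all cells of one sample."""
    return [sum(column) for column in zip(*cells)]


# Option A: round every cell's counts first, then sum across cells.
round_then_sum = pseudobulk([[round(x) for x in cell] for cell in corrected])

# Option B: sum the fractional counts first, then round once per gene.
sum_then_round = [round(x) for x in pseudobulk(corrected)]

print("round then sum:", round_then_sum)
print("sum then round:", sum_then_round)
```

With these toy numbers, option A gives `[0, 6, 8]` and option B gives `[1, 5, 8]`. The first gene shows the zero-count concern: three cells at 0.4 each vanish if rounded individually, but survive as a count of 1 if summed first. Whether either option is statistically acceptable for DESeq2 is exactly what I am unsure about.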
Thank you in advance!