I've been learning Seurat recently, and while working on some data I noticed a discrepancy in per-cell library sizes. For a given cell, the library size in my raw count matrix (i.e. unnormalized, and including only those genes not filtered out by Seurat when creating the Seurat object) differs from the library size for the same cell in seurat_object[['RNA']]@counts: the cell in the counts slot always seems to have a bigger library size. I couldn't find anything in the documentation that describes this, and I don't know how to view the source code, since printing the function in the R console just gives me UseMethod(generic = "NormalizeData", object = object). Does anyone know what might be happening?
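On the source-code part of the question: the UseMethod(...) output means the function is an S3 generic, so the actual implementation lives in class-specific methods. Here is a minimal base-R sketch of how those can be listed and printed. The generic describe and its methods are hypothetical toy functions made up for illustration, not anything from Seurat.

```r
# Toy S3 generic (hypothetical) that prints just like the one in the question.
describe <- function(object, ...) {
  UseMethod(generic = "describe", object = object)
}

# Two toy methods the generic can dispatch to.
describe.default <- function(object, ...) {
  "generic object"
}
describe.numeric <- function(object, ...) {
  paste("numeric vector of length", length(object))
}

# Printing the generic only shows the dispatcher, not the implementation.
print(describe)

# List the class-specific methods available for the generic.
available <- methods("describe")
print(available)

# Retrieve and print the implementation for one class.
impl <- getS3method("describe", "numeric")
print(impl)
```

With Seurat loaded, the analogous calls would presumably be methods("NormalizeData") followed by getS3method("NormalizeData", "Seurat"). Which classes actually have methods depends on the installed version, so it is worth checking the methods() output first.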
Yeah, you're right. I checked the library size ratios for all cells between my raw matrix (with all genes) and the Seurat counts, and they're mostly one or a bit below one. So I guess it stores the original library size for normalizing the values, which makes sense, although I have no idea why it sometimes stores a slightly lower library size. I briefly looked at some of the code in their GitHub that you kindly linked, but I think I will just trust that Seurat does something clever here...
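For anyone who wants to repeat the ratio check described above, here is a rough base-R sketch. The matrices are made-up toy data, and raw_counts, stored_counts and min_cells are hypothetical names. With a real object, stored_counts would be the matrix from seurat_object[['RNA']]@counts. The gene filter here only imitates a "keep genes detected in at least N cells" rule and is an assumption on my part, not confirmed Seurat behaviour. It does show one way a ratio slightly below one could arise.

```r
# Toy raw count matrix (genes x cells); the values are made up.
raw_counts <- matrix(
  c(5, 0, 2, 1,
    3, 4, 0, 2,
    0, 0, 1, 0,   # gene3 is detected in a single cell only
    2, 1, 3, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4))
)

# Hypothetical filter: keep genes detected in at least `min_cells` cells.
min_cells <- 2
keep <- rowSums(raw_counts > 0) >= min_cells
stored_counts <- raw_counts[keep, , drop = FALSE]

# Library size = total counts per cell (column sums).
lib_raw <- colSums(raw_counts)
lib_stored <- colSums(stored_counts)

# Match cells by name before dividing, in case the column order differs.
shared <- intersect(names(lib_raw), names(lib_stored))
ratio <- lib_stored[shared] / lib_raw[shared]

print(ratio)
summary(ratio)
```

In this toy case only cell3 ends up below one (5/6), because it is the only cell that had counts in the dropped gene. All other cells stay at exactly one.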