Question

SoupX with scDblFinder, matrix error

2.5 years ago

sushisena • 0

I have a dataset from a paper that I want to run through the Seurat pipeline. Before that, I ran SoupX on each sample:

    P018_1.data <- Read10X(data.dir = "/fast/scratch/users/cetinsz_c/GSE199321/UR_AKI_P018.1/outs/filtered_feature_bc_matrix")
    P018_1 <- CreateSeuratObject(counts = P018_1.data, project = "P018.1", min.cells = 3, min.features = 200)
    sc_P018_1 = load10X("/fast/scratch/users/xxxx/GSE199321/UR_AKI_P018.1/outs/")
    sc_P018_1 = autoEstCont(sc_P018_1, tfidfMin = 0.6)
    out_P018_1 = adjustCounts(sc_P018_1, roundToInt=T)
    P018_1@assays$RNA@data@x <- out_P018_1@x

I have 20 samples like this plus 6 pooled samples. Next, I wanted to run scDblFinder:

    # load seurat objects from demultiplexed samples (see "AKI_Urine_Sediment_Demultiplexing_pooled_samples_script") ----
    Pool1 <- readRDS("/fast/scratch/users//xxx/akiproject/AKI_Urine_sediment_Pool1.rds")
    Pool2 <- readRDS("/fast/scratch/users//xxx/akiproject/AKI_Urine_sediment_Pool2.rds")
    Pool3 <- readRDS("/fast/scratch/users//xxx/akiproject/AKI_Urine_sediment_Pool3.rds")
    Pool4 <- readRDS("/fast/scratch/users//xxx/akiproject/AKI_Urine_sediment_Pool4.rds")
    Pool5 <- readRDS("/fast/scratch/users/xxx/akiproject/AKI_Urine_sediment_Pool5.rds")
    Pool6 <- readRDS("/fast/scratch/users/xxx/akiproject/AKI_Urine_sediment_Pool6.rds")

    # Copy the 'orig.ident' values to a new 'idents' column in the metadata
    Pool4@meta.data$idents <- Pool4@meta.data$orig.ident
    Pool6@meta.data$idents <- Pool6@meta.data$orig.ident

    # Identify the cells with the idents value "P054"
    cells_to_remove <- rownames(Pool4@meta.data[Pool4@meta.data$idents == "P054", ])

    # Subset the Seurat object using the 'subset' function
    Pool4 <- subset(Pool4, cells = colnames(Pool4)[!colnames(Pool4) %in% cells_to_remove])

    # Identify the cells with the idents values "P116", "P118", and "P120"
    cells_to_remove_Pool6 <- rownames(Pool6@meta.data[Pool6@meta.data$idents %in% c("P116", "P118", "P120"), ])

    # Subset the Seurat object using the 'subset' function
    Pool6 <- subset(Pool6, cells = colnames(Pool6)[!colnames(Pool6) %in% cells_to_remove_Pool6])

    # make a list of all objects, determine percentage of mitochondrial RNA and doublets per sample. ----
    URINEList <- list(P005, P006, P007, P017_1, P017_2, P018_1, P018_2, P019_1, P019_2, P021, P022, P023_1, P023_2, P023_3, P023_4, P024_1, P024_2, P001, P002_1, P002_2, P003, Pool1, Pool2, Pool3, Pool4, Pool5, Pool6)

    URINEList <- list(P005, P006, P007)

    # does not work: Assuming the input to be a matrix of counts or expected counts.
    # Error in validObject(result) : invalid class “dgCMatrix” object: 'i' and 'x' slots do not have equal length
    for (i in 1:length(URINEList)) {
      URINEList[[i]][["percent.mt"]] <- PercentageFeatureSet(URINEList[[i]], pattern = "^MT-")
      doublets <- scDblFinder(GetAssayData(URINEList[[i]], assay = "RNA", slot = "data"))
      doublets <- as.vector(doublets@colData@listData[["scDblFinder.class"]])
      URINEList[[i]]@meta.data$multiplet_class <- doublets
      URINEList[[i]]@project.name <- levels(URINEList[[i]]@active.ident)
      URINEList[[i]] <- RenameCells(URINEList[[i]], add.cell.id = paste0(URINEList[[i]]$orig.ident, "_"))
    }

The error I get: Error in validObject(result) : invalid class “dgCMatrix” object: 'i' and 'x' slots do not have equal length

Info about the URINEList:

    Dimensions of the input matrix: [1] 23331 1459
    Number of non-zero elements in the input matrix: [1] 2600061
    Class of the input matrix: [1] "dgCMatrix"
    Length of 'i' slot: [1] 2846422
    Length of 'x' slot: [1] 2872704
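The differing 'i' and 'x' lengths reported above can be reproduced on a toy matrix. The sketch below uses only the Matrix package; the matrix `m` and the replacement values are invented for illustration and are not taken from this dataset. It assumes the error comes from overwriting only the `@x` slot with values from a matrix that has a different number of non-zero entries, which is what the `@data@x <- out_P018_1@x` assignment would do if the Seurat object and the SoupX output do not contain the same genes and cells.

```r
library(Matrix)

# Toy 4 x 3 sparse matrix standing in for the filtered matrix in the Seurat object.
m <- sparseMatrix(
  i = c(1, 3, 2, 4, 1),
  j = c(1, 1, 2, 3, 3),
  x = c(5, 2, 7, 1, 3),
  dims = c(4, 3)
)

# In a valid dgCMatrix, every stored value in @x has exactly one row index in @i.
same_length_before <- length(m@i) == length(m@x)

# Overwrite only the values with a longer vector, as if they came from a
# matrix with more non-zero entries. Slot assignment does not re-validate the object.
broken <- m
broken@x <- c(5, 2, 7, 1, 3, 9, 9)

# validObject() now fails; the exact wording depends on the Matrix version.
check <- tryCatch(validObject(broken), error = function(e) conditionMessage(e))
print(check)
```

Any function that validates or rebuilds the matrix, as scDblFinder appears to do here, would then hit the same validity error.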

I tried a different version of the loop:

    for (i in 1:length(URINEList)) {
      assay_data <- GetAssayData(URINEList[[i]], assay = "RNA", slot = "data")

      URINEList[[i]][["percent.mt"]] <- PercentageFeatureSet(URINEList[[i]], pattern = "^MT-")

      doublets <- scDblFinder(as.matrix(assay_data))
      doublets <- as.vector(doublets@colData@listData[["scDblFinder.class"]])
      URINEList[[i]]@meta.data$multiplet_class <- doublets

      URINEList[[i]]@project.name <- levels(URINEList[[i]]@active.ident)
      URINEList[[i]] <- RenameCells(URINEList[[i]], add.cell.id = paste0(URINEList[[i]]$orig.ident, "_"))
    }

This is the error I get this time:

    Creating ~5000 artificial doublets...
    Dimensional reduction
    Evaluating kNN...
    Training model...
    iter=0, 399 cells excluded from training.
    iter=1, 227 cells excluded from training.
    iter=2, 220 cells excluded from training.
    Threshold found:0.514
    96 (2%) doublets called
    Error in .checkSCE(sce) : sce should be a SingleCellExperiment, a SummarizedExperiment, or an array (i.e. matrix, sparse matric, etc.) of counts.
    In addition: Warning message:
    In asMethod(object) : sparse->dense coercion: allocating vector of size 3.5 GiB

Is there any way to fix this problem? I assume it happens because SoupX removes cells or genes before I look for doublets in the dataset, because when I don't run SoupX I don't get this error.
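One possible way around the mismatch, sketched under assumptions: subset the corrected matrix to the genes and cells that survived filtering, matching by name, and then replace the whole matrix instead of only its `@x` slot. The example uses only the Matrix package. `adjusted`, `keep_genes` and `keep_cells` are hypothetical stand-ins for the `adjustCounts()` output and for the row and column names of the filtered Seurat matrix.

```r
library(Matrix)

# Hypothetical stand-in for the adjustCounts() output: 5 genes x 4 cells.
adjusted <- sparseMatrix(
  i = c(1, 2, 3, 5, 1, 4, 2, 5),
  j = c(1, 1, 2, 2, 3, 3, 4, 4),
  x = c(4, 1, 2, 6, 3, 1, 5, 2),
  dims = c(5, 4),
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

# Hypothetical stand-ins for the genes and cells kept after min.cells / min.features filtering.
keep_genes <- c("gene1", "gene2", "gene5")
keep_cells <- c("cell1", "cell2", "cell4")

# Subsetting by name keeps @i, @p and @x consistent with each other.
aligned <- adjusted[keep_genes, keep_cells]

validObject(aligned)
print(dim(aligned))
```

In a Seurat workflow, an alternative that avoids slot assignment altogether may be to pass the `adjustCounts()` output directly to `CreateSeuratObject(counts = ...)`; that has not been tested on this dataset, and gene names may need checking if they were altered on import.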

r scdblfinder seurat soupx • 1.3k views

updated 2.5 years ago by ATpoint 89k • written 2.5 years ago by sushisena • 0

Please format this wall of code (the 10101 button), remove unnecessary code and describe what exactly the problem is.

2.5 years ago by ATpoint 89k