Subscript out of bounds in Seurat expression matrix [SOLVED]
3.7 years ago

Hello

I'm trying to build an expression matrix to use as input for a different heatmap tool that works with Seurat data. For that, I want to take the 500 DEGs with the lowest adjusted p-values.

Here is what I tried:

DEGs <- all.markers %>% top_n(n = -500, wt = p_val_adj)
DEGs.genes <- DEGs$gene
DEGs.genes <- unique(DEGs.genes)
integratedexpression <- as.matrix(GetAssayData(circBALF.pred.integrated.CD4, assay = "integrated"))
integratedexpression.filtered <- integratedexpression[DEGs.genes, ]
annotations <- circBALF.pred.integrated.CD4@meta.data
annotations <- t(annotations)

The 5th line gives me the following error message:

Error in integratedexpression[DEGs.genes, ] : subscript out of bounds

This does not occur if I extract the SCT expression values instead; that runs perfectly.

What should I do? The format of the matrix seems to be exactly the same. I searched for the error and, from what I understand, it would mean that the number of rows of the integrated matrix is smaller than 500, which is not the case.

seurat • 6.6k views

When you did the marker analysis, what assay and slot did you use? Some of the slots hold all genes that pass quality filtering, while others only hold the top 2-3k most variable genes.


I used the RNA assay but did not specify the slot, and I assume the default is "data".

Additionally, when I used these genes with the DoHeatmap function on the integrated assay, many genes are indeed left out, but the heatmap is still generated.


I can't remember 100% off the top of my head, but I think all of the RNA slots, as well as the counts and data slots of SCT, contain all genes that pass quality filtering. On the other hand, SCT scale.data and all (or most) of the integrated assay slots only hold the top 2-3k most variable genes. You can double-check each slot by getting the number of rows in the matrix, since rows are the genes.
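
To make that check concrete, here is a minimal sketch in base R. The two small matrices are made-up stand-ins for what GetAssayData would return for two different assays (the gene names and values are purely illustrative, not taken from your object); with a real object you would substitute your own extracted matrices.

```r
# Toy stand-ins for two extracted expression matrices (genes x cells).
# Values and gene names are invented for illustration only.
rna.data <- matrix(
  0, nrow = 5, ncol = 2,
  dimnames = list(c("CD4", "IL7R", "CCR7", "GZMB", "FOXP3"), c("cell1", "cell2"))
)
# Pretend the integrated assay kept only a subset of the genes.
integrated.data <- rna.data[c("CD4", "CCR7", "GZMB"), ]

# Hypothetical DEG list, computed on the assay that holds all genes.
DEGs.genes <- c("GZMB", "FOXP3", "CD4")

# Rows are genes, so nrow() gives the number of genes each matrix holds.
n.genes <- sapply(list(RNA = rna.data, integrated = integrated.data), nrow)
print(n.genes)

# Which requested genes are absent from the matrix you want to subset?
missing.genes <- setdiff(DEGs.genes, rownames(integrated.data))
print(missing.genes)
```

If missing.genes is non-empty, indexing that matrix by the full DEG list will fail.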


I believe you are correct, but still:

When I run the heatmap function with the same list of genes (DEGs$gene) on the integrated assay, the heatmap is generated.

DoMultiBarHeatmap(circBALF.pred.integrated.CD4, assay = 'integrated', features = DEGs$gene, group.by='integrated_snn_res.0.5', additional.group.by = 'Patient.status', additional.group.sort.by = 'Patient.status') + theme(text = element_text(size = 5))

I do get a warning listing the genes that are not present, though:

  The following features were omitted as they were not found in the scale.data slot for the integrated assay:

The reason integratedexpression.filtered <- integratedexpression[DEGs.genes, ] fails is that not all of the genes you are trying to subset are in the rownames of the matrix, and indexing a matrix by a row name it does not contain raises "subscript out of bounds". You can ignore the absent genes by doing this instead:

integratedexpression.filtered <- integratedexpression[rownames(integratedexpression) %in% DEGs.genes, ]
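
One thing to be aware of: the logical mask above returns the rows in the order they appear in the matrix, not in the order of DEGs.genes. If the downstream heatmap tool should see the genes in DEG order, an alternative is to intersect the names first. The sketch below uses a small made-up matrix (gene names and values are illustrative assumptions, not your data) to show the failure and both ways of subsetting.

```r
# Toy stand-in for the integrated matrix: 4 genes x 3 cells, invented values.
integratedexpression <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("CD4", "IL7R", "CCR7", "GZMB"), c("cell1", "cell2", "cell3"))
)
# Hypothetical DEG list; FOXP3 is deliberately absent from the matrix.
DEGs.genes <- c("GZMB", "FOXP3", "CD4")

# Indexing by name fails as soon as one requested name is missing.
err <- tryCatch(
  integratedexpression[DEGs.genes, ],
  error = function(e) conditionMessage(e)
)
print(err)

# Logical mask: keeps the genes that are present, in matrix row order.
by.mask <- integratedexpression[rownames(integratedexpression) %in% DEGs.genes, ]

# intersect(): keeps the genes that are present, in DEGs.genes order.
# drop = FALSE keeps a matrix even if only one gene survives.
present <- intersect(DEGs.genes, rownames(integratedexpression))
by.name <- integratedexpression[present, , drop = FALSE]

print(rownames(by.mask))
print(rownames(by.name))
```

Both results contain the same genes; only the row order differs.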

This worked! Thanks!
