Seurat to Dataframe with ONLY counts, cells, and cluster labels?
23 months ago
Kind Katydid ▴ 10

I'm trying to run MDSeq, which appears to require a plain data frame of counts per cell, with labels for conditions (in my case, the clusters found by Seurat's clustering).

I'm completely lost as to what function, if any, could strip all data/metadata except for the 3 above, and then convert it into a dataframe?

Alternatively, I am using the below script to split it into dataframes for each cluster, could/should I combine these and retain/add cluster labels? Seems less elegant than the other approach.

cluster_list <- levels(data_transcripts_seurat@active.ident)

for (cluster in cluster_list) {
  name <- paste("data_cluster_", cluster, sep = "")
  clustersubset <- subset(data_transcripts_seurat, idents = cluster, slot = "counts")
  assign(name, as.data.frame(clustersubset@assays$SCT@counts))
}

R RNA-Seq Seurat • 4.4k views

Why not start with data_transcripts_seurat@assays$SCT@counts?

Something like:

counts.df <- as.data.frame(data_transcripts_seurat@assays$SCT@counts)

If you need the cells to be rows, you can use the t() function (the %>% pipe needs magrittr or dplyr loaded):

library(magrittr)
counts.df <- data_transcripts_seurat@assays$SCT@counts %>% as.matrix %>% t %>% as.data.frame


Thanks! That's a much better way to do it. Could you please also provide guidance on how the cluster assignment can also be transferred (correctly) to this dataframe?


What does data_transcripts_seurat@active.ident get you? Shouldn't those be the cluster assignments per cell? [Disclaimer: I do not use Seurat, so I'm basing this off your initial code.]


Hi, yes I think it is! I did not realise this before (also very new to Seurat, and transcriptomics in general). How can I add this to counts.df? Is it with the attributes function?

Also, should I be concerned about the order of cells being different in counts.df and data_transcripts_seurat@active.ident? Are there ways to make sure they are matched?
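
On the ordering question: the identity factor is named by cell, so one hedged way to avoid relying on position is to index the labels by the row names of the data frame. The sketch below uses a small made-up matrix and a made-up named factor as stand-ins for the counts and for data_transcripts_seurat@active.ident (both are assumptions for illustration, not real data), and it needs only base R.

```r
# Toy stand-ins (hypothetical data, not from a real Seurat object):
# a genes x cells count matrix and a factor of cluster labels named by cell.
counts <- matrix(
  c(5, 0, 2,
    1, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)
# Deliberately stored in a different cell order than the matrix columns.
idents <- factor(c(cell3 = "1", cell1 = "0", cell2 = "1"))

# Cells as rows, genes as columns.
counts.df <- as.data.frame(t(counts))

# Every cell in the data frame must have a label.
stopifnot(all(rownames(counts.df) %in% names(idents)))

# Index the labels by cell name so they follow the row order of counts.df.
counts.df$cluster <- idents[rownames(counts.df)]

print(counts.df)
```

Because the lookup is by name, it does not matter whether the two objects happen to list the cells in the same order; the stopifnot() call fails loudly if any cell has no label.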

18 months ago
Pratik ▴ 850

4 months late, I know... but anyway, I was trying to do something similar. Google helped, and I sort of pieced together this puzzle:

library(dplyr)

# Cells as rows, genes as columns
counts.df <- data_transcripts_seurat@assays$RNA@counts %>% as.matrix %>% t %>% as.data.frame
counts.df <- tibble::rownames_to_column(counts.df, "cellnames")

# Cluster assignment per cell, with the cell names as a column
clusterassignments <- data.frame(data_transcripts_seurat@active.ident)
clusterassignments <- tibble::rownames_to_column(clusterassignments, "cellnames")

# Merge on cell name, so cell order does not matter
counts.df <- merge(clusterassignments, counts.df, by = "cellnames")
rownames(counts.df) <- counts.df$cellnames
counts.df$cellnames <- NULL
colnames(counts.df)[1] <- "clusters"

Note: I chose RNA counts rather than SCT; you could play around with that, I guess. Also, there is probably a better, simpler way; this is what I found. Hope this helps!

Thanks! That's a really good way to do what I intended to. For me, however, I gave up and decided to just put the cluster into the object name as opposed to a column:

cluster_index <- levels(counts_all@active.ident)  # counts_all is my main Seurat object
for (i in cluster_index) {
  name <- paste0("counts_", i)
  clustersubset <- subset(counts_all, idents = i, slot = "counts")
  assign(name, as.data.frame(clustersubset@assays$RNA@counts))
  rm(name, clustersubset)
}
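
If the per-cluster data frames ever need to go back into one table with a cluster column, a minimal base-R sketch might look like the following. The list per_cluster and its contents are hypothetical stand-ins for objects such as those created by the loop above (genes as rows, cells as columns); nothing here calls Seurat or MDSeq.

```r
# Hypothetical per-cluster data frames (genes x cells), keyed by cluster label.
per_cluster <- list(
  "0" = data.frame(cell1 = c(5, 1), cell2 = c(0, 3),
                   row.names = c("geneA", "geneB")),
  "1" = data.frame(cell3 = c(2, 0),
                   row.names = c("geneA", "geneB"))
)

# Transpose each piece to cells x genes and attach its cluster label.
pieces <- lapply(names(per_cluster), function(cl) {
  df <- as.data.frame(t(per_cluster[[cl]]))
  df$cluster <- cl
  df
})

# Stack the pieces; this assumes every piece has the same gene columns.
combined <- do.call(rbind, pieces)
combined$cluster <- factor(combined$cluster)

print(combined)
```

Keeping the label in a column rather than in the object name means downstream code can work on a single data frame instead of looking objects up by name.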