Question: RaceID3 using 10x datasets
Seigfried wrote (12 months ago):

Hello, I wish to cluster my single-cell 10x data using RaceID3. However, I cannot load the data into RaceID using its SCseq function.

10x gave me 3 files: 1) barcodes.tsv.gz 2) features.tsv.gz 3) matrix.mtx.gz

I used Seurat's Read10X function:

library(Seurat)
library(RaceID)

pbmc.data <- Read10X(data.dir = "C:/Users/s/Downloads/")

sc <- SCseq(pbmc.data)

Here is my pbmc.data:

> pbmc.data
33694 x 27179520 sparse Matrix of class "dgCMatrix"

This is the error I get:

sc <- SCseq(pbmc.data)
Error in asMethod(object) : Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105

I also tried using the Matrix package in R.

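library(Matrix)

# Directory holding the three 10x output files (paste0 relies on the trailing slash)
matrix_dir <- "C:/Users/s/Downloads/"
barcode.path <- paste0(matrix_dir, "barcodes.tsv.gz")
features.path <- paste0(matrix_dir, "features.tsv.gz")
matrix.path <- paste0(matrix_dir, "matrix.mtx.gz")

# Read the counts as a sparse matrix (features in rows, barcodes in columns)
mat <- readMM(file = matrix.path)
feature.names <- read.delim(features.path,
                            header = FALSE,
                            stringsAsFactors = FALSE)
barcode.names <- read.delim(barcode.path,
                            header = FALSE,
                            stringsAsFactors = FALSE)

# Label columns with the barcodes and rows with the first column of the features file
colnames(mat) <- barcode.names$V1
rownames(mat) <- feature.names$V1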

This also fails, because R cannot allocate the huge amount of memory requested:

> sc <- SCseq(mat)
Error: cannot allocate vector of size 6823.1 Gb

I understand that RaceID requires a sparse matrix, which I am already providing. Can someone please explain what is going wrong?
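The 6823.1 Gb in the second error can be reproduced from the matrix dimensions alone. The sketch below assumes that the input is being expanded into a dense matrix of 8-byte doubles at some point inside the call; that is an inference from the numbers and from the Cholmod "problem too large" message, not something stated in the thread. R labels the unit "Gb" but divides by 1024^3.

```r
# Dimensions printed for pbmc.data in the question
n_genes    <- 33694
n_barcodes <- 27179520

# Assumption: a dense numeric matrix stores one 8-byte double per entry
bytes_per_double <- 8

# Size of the dense copy in the units R uses for its allocation errors
dense_gib <- n_genes * n_barcodes * bytes_per_double / 1024^3
round(dense_gib, 1)  # 6823.1
```

The match with the reported allocation size is what suggests a dense conversion, so keeping the object sparse on the way in does not by itself avoid the memory request.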

Devon Ryan (Freiburg, Germany) wrote (12 months ago):

RaceID is requesting about 7 TB of RAM to load that dataset, which is almost certainly more than you have. I can tell you from experience that RaceID3 does not currently scale well to 10x-sized data, so in addition to needing an absurd amount of RAM, it will need a LOT of time to run. I recommend switching to something else for this kind of data.
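Whichever tool is used, the column count itself may be worth a second look. 27,179,520 is exactly 4 x 6,794,880, and 6,794,880 is, to my knowledge, the number of barcodes in a raw (unfiltered) 10x v3 matrix, so the loaded files may be the raw output rather than the filtered one. That is a guess from the numbers, not something confirmed in the thread. Under that assumption, a minimal sketch of dropping near-empty barcodes while staying sparse could look like this; the toy matrix and the `min_umis` threshold are hypothetical stand-ins, not recommended values.

```r
library(Matrix)

# Toy stand-in for a raw 10x matrix: 4 genes x 6 barcodes,
# where bc3..bc6 are empty or nearly empty droplets.
counts <- sparseMatrix(
  i = c(1, 2, 3, 1, 4, 2),
  j = c(1, 1, 1, 2, 2, 4),
  x = c(10, 5, 7, 8, 9, 1),
  dims = c(4, 6),
  dimnames = list(paste0("gene", 1:4), paste0("bc", 1:6))
)

# Hypothetical threshold: minimum total counts for a barcode to be kept
min_umis <- 5

# Matrix::colSums works on the sparse representation directly
umis_per_barcode <- Matrix::colSums(counts)
keep <- umis_per_barcode >= min_umis

# Column subsetting keeps the result sparse
filtered <- counts[, keep, drop = FALSE]
dim(filtered)

# Only after this step would the matrix be passed on, e.g.:
# sc <- SCseq(filtered)
```

On real data the filtered output directory written by Cell Ranger would normally be the simpler starting point; the sketch only shows that the reduction can be done without ever creating a dense copy.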


Thank you for your reply @Devon Ryan

The count matrix I am using is currently a "cellranger aggregate" of 4 different samples. I tried using Seurat for clustering but since my samples are cell culture samples with differing conditions, they do not cluster well.

I could run this on a single sample but then it would defeat the purpose of identifying cell lineages.

Could you please recommend any other tools I can use to do this effectively? I am currently trying out Slingshot.

Wishing you a Happy New Year and Decade!

Reply written 12 months ago by Seigfried

Play with the parameters in Seurat more, including how you're dealing with batches (i.e., samples). You can also try things like scanorama and scanpy.

Reply written 12 months ago by Devon Ryan