Memory Error when running scrublet
21 months ago
Assa Yeroslaviz ★ 1.7k

Hi, I'm getting the following error when trying to run Scrublet on my data:

>>> doublet_scores, predicted_doublets = scrub.scrub_doublets(min_counts=2,
...                                                           min_cells=3,
...                                                           min_gene_variability_pctl=85,
...                                                           n_prin_comps=30)
Preprocessing...
/home/scrublet/helper_functions.py:321: RuntimeWarning: divide by zero encountered in true_divide
w.setdiag(float(target_total) / tots_use)
/home/scrublet/helper_functions.py:252: RuntimeWarning: invalid value encountered in sqrt
CV_input = np.sqrt(b);
Simulating doublets...
/home/scrublet/helper_functions.py:321: RuntimeWarning: divide by zero encountered in true_divide
w.setdiag(float(target_total) / tots_use)
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
File "/home/scrublet/scrublet.py", line 224, in scrub_doublets
pipeline_zscore(self)
File "/home/scrublet/helper_functions.py", line 65, in pipeline_zscore
self._E_sim_norm = np.array(sparse_zscore(self._E_sim_norm, gene_means, gene_stdevs))
File "/home/scrublet/helper_functions.py", line 173, in sparse_zscore
return sparse_multiply((E - gene_mean).T, 1/gene_stdev).T
File "/home/scrublet/helper_functions.py", line 164, in sparse_multiply
return w * E
File "/home/scipy/sparse/base.py", line 518, in __mul__
result = self._mul_multivector(np.asarray(other))
File "/home/scipy/sparse/base.py", line 536, in _mul_multivector
return self.tocsr()._mul_multivector(other)
File "/home/scipy/sparse/compressed.py", line 485, in _mul_multivector
dtype=upcast_char(self.dtype.char, other.dtype.char))
MemoryError: Unable to allocate 167. GiB for an array with shape (1651, 13589760) and data type float64
>>>


The tool was run within a conda environment (if this makes any difference).

My data set contains a counts matrix with 6794880 rows and 31053 columns. The gene list contains 31053 genes.

Is there a way to deal with this problem?

Thanks

scrublet scRNA-seq doublet • 798 views

Likely not. Looks like the program wants to allocate 167 GiB of memory. Does it work with smaller datasets?
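
For reference, the 167 GiB figure follows directly from the shape and dtype in the error message. The short check below only redoes that arithmetic. It also shows that the second dimension is exactly twice the 6794880 rows reported for the counts matrix, which would be consistent with Scrublet simulating two doublets per observed cell (I believe `sim_doublet_ratio` defaults to 2.0, but that is an assumption and not something shown in the log). Reading 1651 as the number of genes left after filtering is likewise an assumption.

```python
# Shape and dtype are copied from the MemoryError message in the question.
n_rows, n_cols = 1651, 13589760
bytes_per_value = 8  # float64 uses 8 bytes per element

# A dense array stores every element, zeros included.
dense_bytes = n_rows * n_cols * bytes_per_value
dense_gib = dense_bytes / 2**30

print(f"{dense_bytes:,} bytes = {dense_gib:.1f} GiB")

# Compare with the number of rows reported for the counts matrix.
n_cells = 6794880
print(f"columns per input row: {n_cols / n_cells}")
```

Because the size scales linearly with the number of rows in the input, halving the number of rows would roughly halve the allocation.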

I would assume so, but that wouldn't help me if I can't run it on a normal single-cell sparse matrix.

I do have enough memory on my server, though, so this shouldn't be a problem.

Have you tried increasing the amount of allocated memory beyond 167 GiB plus 10-20%?

scrublet also says:

When working with data from multiple samples, run Scrublet on each sample separately.

No, not yet. I don't have any memory restrictions, so it should be able to use everything on the server. I can't understand why it is restricted to begin with.

This is only one sample.
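
One thing that may be worth checking, given the log above: the "divide by zero" warnings are raised on `float(target_total) / tots_use`, which suggests that at least some rows of the counts matrix have a total count of zero. The sketch below counts such rows and drops them before any further processing. It assumes the counts are a SciPy sparse matrix with cells as rows, as in the question. The function name `drop_low_count_rows` and the `min_total` threshold are hypothetical, and the toy matrix is only a stand-in for the real data.

```python
import numpy as np
from scipy import sparse


def drop_low_count_rows(counts, min_total=1):
    """Keep rows whose total count is at least min_total.

    Returns the filtered CSR matrix and the boolean mask of kept rows.
    """
    counts = sparse.csr_matrix(counts)
    # Row sums of a sparse matrix come back as an (n, 1) matrix; flatten to 1-D.
    totals = np.asarray(counts.sum(axis=1)).ravel()
    keep = totals >= min_total
    return counts[np.flatnonzero(keep)], keep


# Toy stand-in for the real counts matrix (rows = cells, columns = genes).
toy = sparse.csr_matrix(np.array([[0, 0, 0],
                                  [3, 0, 1],
                                  [0, 0, 0],
                                  [0, 2, 0]]))

filtered, keep = drop_low_count_rows(toy)
print(f"{int((~keep).sum())} of {toy.shape[0]} rows have zero total counts")
print("filtered shape:", filtered.shape)
```

Which threshold is appropriate depends on the data. The filtered matrix could then be passed on in place of the original one.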