Adding counts from two barcodes in anndata object
2.2 years ago

Hello, I have a Python AnnData object containing gene counts for a number of barcodes. Because of technical details of the single-cell technology it comes from, each cell receives two barcodes instead of one. Therefore, to recover the gene counts for a cell, I need to sum the counts from its two barcodes.

I am using the following code:

adata.X = adata.X.tolil()

for first_index, second_index in zip(first_list_indices, second_list_indices):
    adata.X[first_index,:] += adata.X[second_index,:]

adata.X = adata.X.tocsr()

Here, first_list_indices and second_list_indices are lists of the row indices of the barcodes to be summed, paired by position. In a subsequent step, I remove the barcodes listed in second_list_indices:

import numpy as np

# Boolean mask over barcodes: True = keep, False = drop (the second barcode of each pair)
to_keep = np.ones(adata.n_obs, dtype=bool)
to_keep[second_list_indices] = False

# Remove second barcodes
adata = adata[to_keep,:].copy()

However, my computer runs out of memory when I run this code on a large matrix. I am sure there is a much more efficient way to do this, and I would really appreciate it if someone could help me optimise the code so that it uses less memory.
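One direction that might avoid the LIL conversion (which is likely where most of the memory goes) is to express "sum the paired rows, then drop the second ones" as a single sparse matrix product. The following is only a minimal sketch, not tested on real data: the function name merge_barcode_pairs is hypothetical, and it assumes that no index appears in both lists (no chained pairs) and that the count matrix fits in memory as CSR.

```python
import numpy as np
import scipy.sparse as sp


def merge_barcode_pairs(X, first_idx, second_idx):
    """Add row second_idx[k] to row first_idx[k], then drop all second rows.

    Hypothetical helper. Assumes no index is listed as both a first and a
    second barcode. Returns the merged CSR matrix and the boolean keep mask.
    """
    X = sp.csr_matrix(X)  # converts to CSR if X is in another format
    n_obs = X.shape[0]
    first_idx = np.asarray(first_idx, dtype=np.int64)
    second_idx = np.asarray(second_idx, dtype=np.int64)

    # Mask of rows that survive (everything except the second barcodes).
    keep = np.ones(n_obs, dtype=bool)
    keep[second_idx] = False
    if not keep[first_idx].all():
        raise ValueError("an index appears as both first and second barcode")
    kept_rows = np.flatnonzero(keep)

    # Map each surviving original row to its row position in the output.
    new_pos = np.full(n_obs, -1, dtype=np.int64)
    new_pos[kept_rows] = np.arange(kept_rows.size)

    # Aggregation matrix A (n_kept x n_obs): A[i, j] = 1 when original row j
    # contributes to output row i. Each kept row contributes to itself, and
    # each second barcode contributes to the output row of its partner.
    rows = np.concatenate([new_pos[kept_rows], new_pos[first_idx]])
    cols = np.concatenate([kept_rows, second_idx])
    data = np.ones(rows.size, dtype=X.dtype)
    A = sp.csr_matrix((data, (rows, cols)), shape=(kept_rows.size, n_obs))

    # One sparse-sparse product performs the summing and the row removal.
    return (A @ X).tocsr(), keep
```

With the names from the question, this would be called as merged, keep = merge_barcode_pairs(adata.X, first_list_indices, second_list_indices). A new object could then be built with something like anndata.AnnData(X=merged, obs=adata.obs.loc[keep].copy(), var=adata.var.copy()), although that would not carry over layers, obsm or other slots, which would need handling separately.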

Thanks so much!!

scRNA-seq anndata • 605 views
