Question

scanpy - subset AnnData to remove samples with low numbers of cells

17 months ago

ES • 0

Hi,

I am using scanpy for scRNA-seq analysis. Please could someone help with how to subset an AnnData object to remove samples with a number of cells below a threshold? Could I use an extension of the value_counts() function?

e.g. adata.obs['Sample'].value_counts() < 500

Thank you!

scanpy python scRNAseq • 3.0k views

You can filter cells by the number of genes detected with sc.pp.filter_cells(adata, min_genes=200), and filter genes by the number of cells expressing them with sc.pp.filter_genes(adata, min_cells=3).

17 months ago by zorbax ▴ 610

Thank you for your reply. I do use the filters you mentioned, but what I'd like to do is remove a whole sample based on the total number of cells in that sample, rather than filtering individual genes and cells regardless of which sample they come from.

17 months ago by ES • 0

score 1 · Accepted Answer · 2022-11-11

17 months ago

zorbax ▴ 610

You need to keep only the samples whose cell count is at or above your threshold:

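# Count the cells in each sample
sample_counts = adata.obs['Sample'].value_counts()
# Names of the samples with at least 500 cells
keep = sample_counts.index[sample_counts >= 500]
# Keep only cells from those samples; .copy() returns a new object rather than a view
filtered_adata = adata[adata.obs['Sample'].isin(keep)].copy()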
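
To see what each step produces, here is a minimal sketch on a toy table standing in for adata.obs. The sample names, cell labels, and the threshold of 3 are invented for illustration; only pandas is needed. It also shows one possible alternative that builds the per-cell mask directly with groupby/transform, which should give the same result.

```python
import pandas as pd

# Toy stand-in for adata.obs: one row per cell with a 'Sample' column.
# Sample names and sizes are made up for this example.
obs = pd.DataFrame(
    {"Sample": ["A"] * 5 + ["B"] * 2 + ["C"] * 4},
    index=[f"cell{i}" for i in range(11)],
)
min_cells = 3  # hypothetical threshold; the question uses 500

# Approach from the answer: count cells per sample, then test membership.
sample_counts = obs["Sample"].value_counts()
keep = sample_counts.index[sample_counts >= min_cells]
mask_isin = obs["Sample"].isin(keep)

# Alternative: attach each sample's size to every one of its cells,
# then compare against the threshold to get a per-cell boolean mask.
mask_transform = obs.groupby("Sample")["Sample"].transform("size") >= min_cells

# Rows (cells) belonging to samples that pass the threshold.
filtered_obs = obs[mask_isin]
print(filtered_obs["Sample"].value_counts())
```

Note that value_counts() < 500 on its own gives one True/False per sample, not per cell, which is why the isin (or transform) step is needed before indexing. With a real AnnData object the per-cell mask is used as in the answer, e.g. adata[mask_isin].copy().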