scanpy - subset AnnData to remove samples with low numbers of cells
17 months ago
ES • 0

Hi,

I am using scanpy for scRNA-seq analysis. Please could someone help with how to subset an AnnData object to remove samples with a number of cells below a threshold? Could I use an extension of the value_counts() function?

e.g. adata.obs['Sample'].value_counts() < 500

Thank you!

scanpy python scRNAseq • 3.0k views

You can filter cells and genes by count with sc.pp.filter_cells(adata, min_genes=200) and sc.pp.filter_genes(adata, min_cells=3), respectively.


Thank you for your reply. I already use the filters you mentioned, but what I'd like to do is remove a whole sample based on the total number of cells in that sample, rather than filtering individual genes and cells regardless of which sample they come from.

17 months ago
zorbax ▴ 610

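You need to keep only the samples whose cell count is at or above your threshold:

# Number of cells per sample
sample_counts = adata.obs['Sample'].value_counts()
# Labels of the samples with at least 500 cells
keep = sample_counts.index[sample_counts >= 500]
# Subset to cells from those samples; .copy() turns the view into a real object
filtered_adata = adata[adata.obs['Sample'].isin(keep)].copy()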
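
To see the logic without loading a dataset, here is a minimal sketch on a toy pandas DataFrame standing in for adata.obs. The toy data, the min_cells name and the small threshold are assumptions for illustration only. The groupby/transform route is an alternative that was not part of the answer above, and it should give the same per-cell mask.

```python
import pandas as pd

# Toy stand-in for adata.obs: one row per cell, 'Sample' labels each cell.
obs = pd.DataFrame(
    {"Sample": ["A"] * 5 + ["B"] * 2 + ["C"] * 3},
    index=[f"cell{i}" for i in range(10)],
)

min_cells = 3  # hypothetical threshold for the toy data; the thread uses 500

# Route 1 (as in the answer): count cells per sample, keep passing labels.
sample_counts = obs["Sample"].value_counts()
keep = sample_counts.index[sample_counts >= min_cells]
mask_isin = obs["Sample"].isin(keep)

# Route 2 (alternative): attach each cell's sample size to its row, then compare.
cells_per_sample = obs.groupby("Sample")["Sample"].transform("size")
mask_transform = cells_per_sample >= min_cells

# Both routes mark the same cells.
assert (mask_isin == mask_transform).all()

kept_samples = sorted(obs.loc[mask_isin, "Sample"].unique())
print(kept_samples)  # ['A', 'C']
```

On a real AnnData object, a mask built either way from adata.obs would be applied in the same form as above, i.e. adata[mask].copy().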

Great, thank you very much!
