Question

Regarding the usage of raw vs filtered bc matrices

0

7 weeks ago

biotrekker ▴ 110

Hello, I am working with Alzheimer's single-nucleus RNA-seq (snRNA-seq) data. The data are from post-mortem human brain tissue, and Alzheimer's pathophysiology clearly affects cell collection quality.

I was given FASTQ files, which I processed with the standard 10x Genomics Cell Ranger pipeline. That gave me raw and filtered barcode (bc) matrices. I am not getting a strong neuronal signal when using the filtered bc matrices. Should I be using the raw bc matrices and then adding my own filters for better neuronal identification during clustering?

I'm not sure why I am not seeing any real neuronal signal. I am seeing a mixed array of immune cells but not neuron signatures and definitely not layer specific markers in my highly variable genes.

Is it appropriate to start with the raw bc matrices? I suspect Cell Ranger might be removing neurons in its filtering step.

Thanks,

10X ranger cell snRNAseq • 12k views

ADD COMMENT • link updated 7 weeks ago by dsull ★ 7.8k • written 7 weeks ago by biotrekker ▴ 110

1

Cell Ranger is usually quite harsh in its cell selection and readily discards suboptimal cells, but nuclei should be more resilient to bursting. How many nuclei are present in your raw matrix compared to your filtered one?

Not finding neuronal genes expressed in brain tissue is quite disturbing. Do you see any expression when you plot neuronal marker genes on your UMAP/t-SNE?

Do you have the whole brain sequenced or a specific area?

ADD REPLY • link 7 weeks ago by Bastien Hervé 6.5k

0

It varies from sample to sample, but the raw matrix has ~2 million barcodes, while the Cell Ranger filtered h5 files have ~7k cells. None of the key neuronal markers make it into the HVGs (top 5000), since they are not variable. None of the clusters have a clear neuronal signal. The brain region of interest is the MTG (middle temporal gyrus).

The filters I use are min_genes=100 and min_counts=900 for sc.pp.filter_cells in scanpy, plus mt_thresh=5, ribo_thresh=25, hb_thresh=1.

Should I change them?
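
One way to judge thresholds like these is to count how many barcodes each one keeps on its own. The sketch below assumes the per-barcode metrics are already available as arrays, and it assumes that mt_thresh, ribo_thresh and hb_thresh are upper bounds on the percentage of counts from mitochondrial, ribosomal and hemoglobin genes. The function name qc_pass_table is hypothetical, not part of scanpy.

```python
import numpy as np


def qc_pass_table(n_genes, n_counts, pct_mt, pct_ribo, pct_hb,
                  min_genes=100, min_counts=900,
                  mt_thresh=5, ribo_thresh=25, hb_thresh=1):
    """Count barcodes passing each filter alone and all filters together.

    Assumption: the *_thresh values are upper bounds on percent of counts.
    """
    masks = {
        "min_genes": np.asarray(n_genes) >= min_genes,
        "min_counts": np.asarray(n_counts) >= min_counts,
        "mt_thresh": np.asarray(pct_mt) < mt_thresh,
        "ribo_thresh": np.asarray(pct_ribo) < ribo_thresh,
        "hb_thresh": np.asarray(pct_hb) < hb_thresh,
    }
    # Number of barcodes kept by each filter applied on its own
    table = {name: int(mask.sum()) for name, mask in masks.items()}
    # Number of barcodes kept when every filter is applied at once
    table["all"] = int(np.logical_and.reduce(list(masks.values())).sum())
    return table
```

If one filter keeps far fewer barcodes than the others, that is the threshold worth revisiting first.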

ADD REPLY • link 7 weeks ago by biotrekker ▴ 110

score 1 · Answer 1 · 2025-09-12

1

7 weeks ago

dsull ★ 7.8k

Yes, start with the raw matrices and apply your own filtering; that's what I always do. (It will also help you with QC.)

ADD COMMENT • link 7 weeks ago by dsull ★ 7.8k

1

Interesting, I never had the need to actually re-do the filtering from CellRanger.

Keep in mind, @OP, that this sort of filtering is entirely technical and has nothing to do with biology. It merely aims to decide which droplets correspond to truly captured, good-quality cells and which droplets either captured damaged cells or are simply empty, i.e. captured only ambient RNA. First of all, I would make sure that your actual analysis is fine. Review the experimental protocols, look at similar datasets to see what you can expect, and check whether the results you get from the filtered bc matrices could be normal and expected. I am not saying that CellRanger is perfect by any means, but it would take a lot for it to filter so poorly that an entire set of cells (here the neuronal ones) gets accidentally filtered away. Again, it could be the case. After all, you can just load the raw matrix, retain all cells with, say, 500 genes and a certain minimum depth, and then color a UMAP by canonical neuronal markers that MUST be present in the dataset. That will quickly tell you whether the filtering was removing entire populations.

ADD REPLY • link 7 weeks ago by ATpoint 89k

1

Yeah, it's worth trying CellRanger versus different types of custom filtering (and possibly a few different settings for the downstream steps). I suspect that OP will not identify the desired biological signal no matter how hard they try, which would essentially amount to "proving a negative". (If a signal is detected after some customization, then they need to figure out why their original settings didn't produce that signal.)

ADD REPLY • link 7 weeks ago by dsull ★ 7.8k

1

"but raw has ~2 million, but cell ranger filtered h5 files have ~7k cells"

Yes, that sounds normal since most droplets are always empty.

"None of key neuronal markers make it past HVG (top 5000), since they are not variable. None of clusters have a clear neuronal signal. The brain region of interest is the MTG."

Was any enrichment done in the wet lab? I am not a brain person, but my sanity check would be the presence of neurons, astrocytes, microglia, etc. Just make a couple of UMAPs, colored by cluster and by key markers, and see whether the markers largely co-localize with one or several clusters. That's fast and gives a crude idea of whether things are normal.
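
A numeric companion to eyeballing the UMAPs is the fraction of cells in each cluster that express a given marker. The sketch below is a minimal version using plain NumPy arrays; the function name marker_fraction_by_cluster is hypothetical, and the choice of markers is left to the reader.

```python
import numpy as np


def marker_fraction_by_cluster(counts, clusters):
    """Fraction of cells per cluster with a nonzero count for one marker.

    counts:   1-D array of per-cell counts for a single marker gene
    clusters: 1-D array of cluster labels, same length as counts
    """
    counts = np.asarray(counts)
    clusters = np.asarray(clusters)
    return {
        str(label): float((counts[clusters == label] > 0).mean())
        for label in np.unique(clusters)
    }
```

A marker that co-localizes with a cluster shows a high fraction in that cluster and low fractions elsewhere; similar fractions across all clusters suggest ambient signal or no real population.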

ADD REPLY • link 7 weeks ago by ATpoint 89k

0

This is single-nucleus data; however, I didn't include introns when running Cell Ranger. Could that be the issue?

ADD REPLY • link 7 weeks ago by biotrekker ▴ 110

2

Yes, you need to map the introns lol; that’s literally like 70% of your data for nuclei preps.
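
To check whether an existing run counted intronic reads, one option is to compare the mapping fractions reported in Cell Ranger's metrics_summary.csv. This is a minimal sketch, assuming the file has the three "Reads Mapped Confidently to ..." columns named below (check your own file, since column names may differ between versions). The 5-point tolerance and the function names are illustrative assumptions, not Cell Ranger conventions.

```python
import csv
import io

# Assumed column names from metrics_summary.csv
INTRONIC = "Reads Mapped Confidently to Intronic Regions"
EXONIC = "Reads Mapped Confidently to Exonic Regions"
TRANSCRIPTOME = "Reads Mapped Confidently to Transcriptome"


def _pct(value):
    """Convert a string such as '55.0%' to the float 55.0."""
    return float(value.strip().rstrip("%"))


def mapping_fractions(csv_text):
    """Read the single data row of a metrics summary given as text."""
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return {name: _pct(row[name]) for name in (INTRONIC, EXONIC, TRANSCRIPTOME)}


def introns_probably_counted(fractions, tolerance=5.0):
    """Heuristic: transcriptome-mapped reads clearly exceed exonic reads.

    If introns were excluded, the transcriptome fraction should not exceed
    the exonic fraction, however large the intronic fraction is.
    """
    return fractions[TRANSCRIPTOME] > fractions[EXONIC] + tolerance


# Usage (hypothetical path):
# with open("outs/metrics_summary.csv") as handle:
#     fractions = mapping_fractions(handle.read())
# print(fractions, introns_probably_counted(fractions))
```

A large intronic fraction combined with a transcriptome fraction no higher than the exonic one suggests that most of the nuclear reads were mapped but never assigned to genes.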

ADD REPLY • link 7 weeks ago by dsull ★ 7.8k

0

Thanks, I realized I messed up when I was looking at single-cell vs single-nucleus processing with 10x. I hope to get a neuronal signal now. I was getting almost nothing before, even though I tried multiple normalization methods and so on. Are there any filtering standards or methodology you recommend?

Thanks

ADD REPLY • link 7 weeks ago by biotrekker ▴ 110

1

Well, I don't have any particular recommendations; that's for you to play around with. You can start with the CellRanger filtered matrix first if you'd like (as ATpoint mentioned above, it has always worked well for him).

ADD REPLY • link 7 weeks ago by dsull ★ 7.8k