Forum: Feature Selection in 10X scRNA-seq data
6 weeks ago, ATpoint (29k, Germany) wrote:

I am looking for opinions (based on hands-on experience) on your favourite feature selection method for 10x scRNA-seq data. The motivation is that I recently came across the GLM-PCA approach from Rafael Irizarry's lab (links at the bottom of the post), which made me dive into the literature. As expected, there are plenty of methods out there, each claiming superior performance. Since GLM-PCA operates on raw counts, it frees the user from having to choose among the many normalization strategies, such as those implemented in scran or SCnorm or the options provided by Seurat, and is therefore attractive. This is admittedly not a precise question (hence a Forum post), and I hope to start a discussion here about your current best practices that users inexperienced in the single-cell world (including myself) can take inspiration from.


GLM-PCA:

Preprint: https://www.biorxiv.org/content/10.1101/574574v1

Paper: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1861-6

Git: https://github.com/willtownes/scrna2019

CRAN: https://cran.r-project.org/web/packages/glmpca/index.html
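
To make the "operates on raw counts" point concrete: the GLM-PCA paper also describes ranking genes by their deviance under a null model of constant relative expression across cells, computed directly on the unnormalized counts. Below is a minimal sketch of that idea, assuming the binomial form of the deviance. The function name `binomial_deviance` and the toy matrix are hypothetical and are not taken from the glmpca package; pure Python is used only to keep the sketch dependency-free, and real data would call for a sparse-matrix implementation.

```python
import math


def binomial_deviance(counts):
    """Per-gene binomial deviance against a constant-expression null.

    counts: list of rows, one per gene, each holding raw UMI counts per cell.
    Returns one deviance per gene; larger values mean the gene deviates more
    from having the same relative abundance in every cell.
    """
    n_genes = len(counts)
    n_cells = len(counts[0])
    # Total UMI count per cell.
    totals = [sum(counts[g][c] for g in range(n_genes)) for c in range(n_cells)]
    grand_total = sum(totals)

    def x_log_ratio(x, mu):
        # x * log(x / mu), using the convention 0 * log(0) = 0.
        return x * math.log(x / mu) if x > 0 else 0.0

    deviances = []
    for row in counts:
        # Null model: the gene makes up the same fraction of every cell.
        pi = sum(row) / grand_total
        if pi == 0.0 or pi == 1.0:
            deviances.append(0.0)  # all-zero gene (or the only gene): uninformative
            continue
        d = 0.0
        for y, n in zip(row, totals):
            d += x_log_ratio(y, n * pi) + x_log_ratio(n - y, n * (1.0 - pi))
        deviances.append(2.0 * d)
    return deviances


# Toy example (made-up numbers): gene 0 is a constant half of every cell,
# genes 1 and 2 are switched on in different cells.
toy_counts = [
    [10, 20, 10, 20],
    [10, 0, 0, 20],
    [0, 20, 10, 0],
]
toy_deviances = binomial_deviance(toy_counts)
# Indices of genes sorted from most to least informative.
ranking = sorted(range(len(toy_counts)), key=lambda g: toy_deviances[g], reverse=True)
print(toy_deviances, ranking)
```

Genes would then be kept from the top of `ranking`. How many to keep is a separate choice that this sketch does not address.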

6 weeks ago, igor (9.6k, United States) wrote:

I saw the GLM-PCA benefits. I believe that there are at least some scenarios where it does perform better. However, does it actually uncover new biological insights? Many single-cell methods make significant improvements on some metrics and look impressive on paper, but very few would actually change the conclusions that were based on classic techniques.

Personal anecdote: I tried not normalizing the data at all and expected completely nonsensical results. However, the major populations still clearly segregated.


Quoting igor: "Personal anecdote: I tried not normalizing the data at all and expected completely nonsensical results. However, the major populations still clearly segregated."

That is an interesting observation indeed. Have you tried it with n > 1 to see if it is widely applicable?

Reply written 6 weeks ago by genomax (78k)

I have not experimented much with it. I've been meaning to run a more comprehensive analysis, but more pressing tasks get in the way.

Reply written 6 weeks ago by igor (9.6k)

My expectation is that you'd see fairly significant sample-to-sample effects with zero normalization, but would be interested in seeing if that's actually true.

Reply written 6 weeks ago by jared.andrews07 (4.9k)

That may be true. I normally see sample-to-sample effects regardless of normalization, unless some batch-correction method such as CCA or MNN is applied.

Reply written 6 weeks ago by igor (9.6k)

I think it also depends on the sample. Normal PBMCs are fairly consistent between samples without batch correction when run through standard pipelines, assuming they are processed fairly close together in time by the same person. Disease samples are a different story though.

Reply written 6 weeks ago by jared.andrews07 (4.9k)

Agreed. High-quality healthy samples processed the same way tend to be fairly consistent.

Reply written 6 weeks ago by igor (9.6k)