SCRAN-normalization
5 months ago

Hello, I am interested in performing differential-expression analysis between cell-types on the following dataset:

https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-5061/

My plan was to use the scran package to normalize the raw counts. One step in the pipeline is to pre-cluster the cells with quickCluster() so that pooling is done within groups of similar cells. However, as I said, I already have the cell-type annotation from the authors. If I compare the annotated cell types to the clusters obtained from quickCluster(), I see the same number of clusters, but the cluster composition is totally different.

How would you proceed in this case?

Best, Andreas

scRNA Normalization • 257 views

You do not have to do this step via quickCluster(); you can also (at least technically) provide cluster information you already have. From OSCA:

We use a pre-clustering step with quickCluster() where cells in each cluster are normalized separately and the size factors are rescaled to be comparable across clusters. This avoids the assumption that most genes are non-DE across the entire population - only a non-DE majority is required between pairs of clusters, which is a weaker assumption for highly heterogeneous populations. By default, quickCluster() will use an approximate algorithm for PCA based on methods from the irlba package. The approximation relies on stochastic initialization so we need to set the random seed (via set.seed()) for reproducibility.
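For reference, the workflow described in that passage might look roughly like the sketch below. It runs on simulated data from scuttle's mockSCE() (an assumption made only so the example is self-contained, not the E-MTAB-5061 counts), and the seed value is arbitrary.

```r
library(scran)    # quickCluster(), computeSumFactors()
library(scuttle)  # mockSCE(), logNormCounts()

# Simulated stand-in for a real count matrix (assumption for illustration).
set.seed(100)
sce <- mockSCE(ncells = 400, ngenes = 2000)

# quickCluster() uses an approximate, randomly initialized PCA by default,
# so the seed is set again right before calling it.
set.seed(100)
clusters <- quickCluster(sce)

# Pool-based size factors, computed within each cluster and then rescaled
# so that they are comparable across clusters.
sce <- computeSumFactors(sce, clusters = clusters)

# Apply the size factors and store log-normalized values in "logcounts".
sce <- logNormCounts(sce)

summary(sizeFactors(sce))
```

The clusters here only guide the pooling; they are not meant to be the final cell-type assignment.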

I think this step is mainly there to avoid problems with sparsity rather than anything else, so using a priori cluster information should be possible, I guess. If you need the opinion of the developer, you can post the question at support.bioconductor.org with scran as a tag; Aaron Lun might then answer it.
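If you want to try the a priori labels, a minimal sketch could look like the following. It again uses simulated data, and the `cell_type` column with its three labels is a hypothetical stand-in for the authors' annotation, so adapt the column name to whatever your metadata actually uses.

```r
library(scran)    # quickCluster(), computeSumFactors()
library(scuttle)  # mockSCE(), logNormCounts()

# Simulated stand-in for the real data (assumption for illustration).
set.seed(100)
sce <- mockSCE(ncells = 400, ngenes = 2000)

# Hypothetical author-provided annotation; the labels are made up.
sce$cell_type <- factor(rep(c("alpha", "beta", "delta"), length.out = ncol(sce)))

# Cross-tabulate the annotation against quickCluster() to see how the
# two groupings overlap.
set.seed(100)
qc_clusters <- quickCluster(sce)
print(table(annotated = sce$cell_type, quick = qc_clusters))

# Use the known labels instead of quickCluster() for the pooling step.
sce <- computeSumFactors(sce, clusters = sce$cell_type)
sce <- logNormCounts(sce)
```

One thing to watch for is very small annotated cell types, since the pooling needs a reasonable number of cells per cluster; merging rare types for the normalization step only may be an option in that case.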