Normalization for subset of data
3.4 years ago
asumani ▴ 70

Hi all,

I need to analyze a subset of publicly available data. The dataset is single-cell RNA-seq of B cells with multiple antibody isotypes. I want to subset the IgE cells (test) and IgM cells (control) for differential expression analysis. Should I normalize before or after subsetting? Does it even matter? And if it does, how does it affect the statistical analysis?

Best,

statistics scRNAseq • 1.2k views

Difficult to answer without more context. Is this a single experiment, or is the data pulled from different sources? You should probably create a single count matrix for the relevant cell types and then feed it into an appropriate statistical framework. That would mean normalizing after subsetting. It certainly matters, especially when the composition and types of cells in the full experiment are very different.
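To make the "does it matter" point concrete, here is a minimal sketch in plain Python. The toy counts, cell names, and helper functions are all hypothetical and not taken from the dataset discussed here. It assumes two simple normalization ingredients: scaling each cell by its own total count, and a scale factor pooled across cells (the median library size). The first is unaffected by subsetting; the second depends on which cells are present.

```python
from statistics import median

# Hypothetical toy counts: each entry is one cell, values are counts per gene.
counts = {
    "IgE_1": [10, 0, 30],
    "IgE_2": [5, 5, 10],
    "IgM_1": [100, 50, 50],
    "IgG_1": [400, 300, 300],  # isotypes that will be dropped by subsetting
    "IgA_1": [900, 50, 50],
}
keep = ["IgE_1", "IgE_2", "IgM_1"]


def subset(mat, cells):
    """Keep only the requested cells."""
    return {cell: mat[cell] for cell in cells}


def per_cell_normalize(mat, scale=1e4):
    """Divide each cell by its own total count, then multiply by a fixed scale."""
    return {cell: [scale * x / sum(row) for x in row] for cell, row in mat.items()}


def median_library_size(mat):
    """A pooled statistic: it depends on every cell in the matrix."""
    return median(sum(row) for row in mat.values())


# Per-cell scaling uses only the cell's own counts, so the order does not matter.
norm_then_subset = subset(per_cell_normalize(counts), keep)
subset_then_norm = per_cell_normalize(subset(counts, keep))
same_per_cell = norm_then_subset == subset_then_norm

# A pooled scale factor changes once the other isotypes are removed.
full_median = median_library_size(counts)
subset_median = median_library_size(subset(counts, keep))

print(same_per_cell)               # True
print(full_median, subset_median)  # 200 40
```

Whether the order matters for a real dataset therefore depends on whether the normalization method borrows information across cells; that has to be checked for the method that produced the published matrix.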


It is a single experiment. The normalized count matrix from the same experiment is already available. My plan is to subset from this existing count matrix.

Alternatively, I could run a separate pipeline on the FASTQ files for the subset to obtain another count matrix, then normalize that subset and do further analysis.

I am unsure whether subsetting from an already normalized matrix is statistically acceptable, or whether I should preprocess the raw data for the subset and then normalize.


Subsetting the existing matrix is probably OK, but you are then limited to statistical tests that directly use the normalized counts, such as the Wilcoxon test. For finding markers, that is probably OK.
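As an illustration of a test that works directly on normalized values, here is a hand-rolled sketch of the two-sided Wilcoxon rank-sum (Mann-Whitney U) test for one gene. It assumes a normal approximation with tie correction and no continuity correction; the expression values and the function name `rank_sum_test` are made up for the example. In practice an established implementation such as `scipy.stats.mannwhitneyu` would be the more sensible choice.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for x, approximate p-value). Ties receive midranks
    and the variance is tie-corrected; no continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    values = sorted(list(x) + list(y))

    # Assign midranks to tied values and accumulate the tie-correction term.
    rank_of = {}
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        t = j - i                              # size of this tie group
        rank_of[values[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a shift
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical normalized expression of one gene in each group.
ige = [0.0, 1.2, 3.4, 2.8, 4.1]
igm = [0.0, 0.0, 0.5, 0.9, 1.0]

u_stat, p_value = rank_sum_test(ige, igm)
print(u_stat, p_value)
```

With only a handful of cells per group the normal approximation is rough; the sketch is meant to show the mechanics, not to replace a tested library routine.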
