Question: Applying batch correction to single-cell RNA-seq from different time points
3 months ago, phenomata wrote:

Hi, I'm pretty new to the scRNA-seq world, and while working on my own data sets I've started to wonder when it is appropriate to use a batch correction algorithm.

Let's say we have a Day0, a Day1, a Day2 and a Day3 scRNA-seq sample.

To elaborate, assume we treated the sample with a certain chemical starting from Day0 and sampled it daily over the course of the experiment.

Would it be OK or reasonable to apply a batch correction algorithm (e.g. CCA) to this aggregation of samples? I mean, is the CCA algorithm designed for this kind of experimental design?

For the Kang et al. (2017) experiment, which consists of PBMCs split into a control group and a group stimulated with interferon beta, they state that "the response to interferon caused cell type specific gene expression changes that makes a joint analysis of all the data difficult with cells clustering both by stimulation condition and by cell type". But is that reasonable?

My understanding is that if you are to use batch correction, you should have biological or technical batches from the "same condition". So if you have replicate samples from the same condition and they somehow separate from each other for technical reasons, it's appropriate to use batch correction.

Going back to the hypothetical experiment I described above, I think (maybe I'm wrong, as I am most of the time) it's not reasonable to apply batch correction to this Day0-Day3 experiment.

Can someone give me a clear explanation of when to use batch correction?

Thank you. Ryan


With the Seurat integration workflow, cells that are probably the same cell type are "forced" to cluster closer together in the dimension reduction by adjusting the expression values of certain genes. This is why, for example, the integration workflow can cluster together cells from different technologies, such as scRNA-seq and scATAC-seq. It is a little more complicated than batch correction by adding batch as a covariate to a regression model, as you would when doing differential expression with DESeq2 or edgeR. With your time course experiment, the integration workflow would likely bring similar cell types closer together despite any transcriptional changes during the time course, so I wouldn't necessarily discount it as an option.
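To make the contrast concrete, here is a minimal sketch of the covariate-style correction mentioned above, written in Python with NumPy. It is not the Seurat, DESeq2 or edgeR implementation; the function name `regress_out_batch` and the toy data are hypothetical. It assumes each time point was processed as its own batch, which is the worry raised in the question: removing the batch term then removes the time effect as well.

```python
import numpy as np

def regress_out_batch(expr, batch):
    """Remove per-batch mean shifts from a cells x genes matrix.

    For each gene this matches fitting expr ~ batch (one-hot) by least
    squares, keeping the residuals and adding back the overall gene mean.
    """
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    corrected = expr.copy()
    overall_mean = expr.mean(axis=0)
    for b in np.unique(batch):
        mask = batch == b
        # Shift this batch so its gene means match the overall means
        corrected[mask] -= expr[mask].mean(axis=0) - overall_mean
    return corrected

# Hypothetical toy data: two time points, each sequenced as its own batch
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5
day = np.repeat(["Day0", "Day3"], n_cells)
expr = rng.normal(size=(2 * n_cells, n_genes))
expr[day == "Day3", 0] += 3.0  # simulated treatment effect on gene 0

corrected = regress_out_batch(expr, batch=day)

# Day3 minus Day0 difference for gene 0, before and after correction
before = expr[day == "Day3", 0].mean() - expr[day == "Day0", 0].mean()
after = corrected[day == "Day3", 0].mean() - corrected[day == "Day0", 0].mean()
print(f"gene 0 difference before: {before:.2f}, after: {after:.2f}")
```

In this toy setup the simulated treatment effect disappears after correction, because batch and time point are perfectly confounded. Anchor-based integration works differently, but the same caution applies when interpreting expression differences between time points on corrected values.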

written 3 months ago by rpolicastro

Thanks for the reply. I'll definitely consider CCA as an option.

written 3 months ago by phenomata

3 months ago, igor (United States) wrote:

> if you are to use batch correction you should have biological or technical batches from the "same condition"

From the Seurat integration vignette: "These methods aim to identify shared cell states that are present across different datasets, even if they were collected from different individuals, experimental conditions, technologies, or even species"

Thus, the "official" answer is that different conditions are fine.

Really, it depends on the questions you want to ask and on the data that you have. For example, if all your time points segregate and form distinct clusters, it's going to be hard to present any kind of coherent analysis.
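One rough way to check whether time points segregate is to look at how mixed each cluster is. The sketch below is an illustration in Python with NumPy, not part of any integration package; the function name `timepoint_mixing` and the labels are hypothetical. It computes, per cluster, the Shannon entropy of the time-point composition, scaled so that 0 means a single time point and 1 means all time points are equally represented.

```python
import numpy as np

def timepoint_mixing(clusters, timepoints):
    """Return {cluster: normalized entropy of its time-point composition}."""
    clusters = np.asarray(clusters)
    timepoints = np.asarray(timepoints)
    levels = np.unique(timepoints)
    result = {}
    for c in np.unique(clusters):
        in_cluster = timepoints[clusters == c]
        # Fraction of this cluster's cells that come from each time point
        frac = np.array([(in_cluster == t).mean() for t in levels])
        nonzero = frac[frac > 0]
        entropy = -(nonzero * np.log(nonzero)).sum()
        # Scale by the maximum possible entropy, log(number of time points)
        scaled = entropy / np.log(len(levels)) if len(levels) > 1 else 0.0
        result[c.item()] = float(scaled)
    return result

# Hypothetical labels: cluster A is pure Day0, cluster B mixes all four days
clusters = ["A"] * 4 + ["B"] * 4
timepoints = ["Day0"] * 4 + ["Day0", "Day1", "Day2", "Day3"]
mixing = timepoint_mixing(clusters, timepoints)
print(mixing)
```

Here cluster A scores 0 and cluster B scores 1 (up to floating-point rounding). If most clusters in the uncorrected data score near 0, the time points are forming their own clusters, and integration may be worth trying before comparing cell types across days.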


Thank you for the answer, igor. I guess I didn't read their vignette carefully enough.

written 3 months ago by phenomata