Intra-dataset (Single Dataset) Batch Correction Tips for scRNA-seq
2.8 years ago

Hello!

I'm fairly new to scRNA-seq and bioinformatics in general, and am just starting to observe the nuances that go into the analysis (in this case, batch effect correction). In investigating how to deal with batch effects, I've found lots of potential packages to deal with batch effects when integrating two or more separate datasets together into one, such as the following:

However, I have not found much information on how to deal with batch effects between samples/patients within ONE dataset/study. From what I understand, how to address this is still an open question in the field, and it seems to be something you try to minimize when designing your experiment rather than something you actively correct for after the fact. But I am using publicly available datasets with no access to the raw data, and I am pretty sure there are batch effects between the samples within each of the individual datasets I am using.

So my question is, without access to the raw data, what are some ways I can alleviate (and detect) batch effects within the samples of ONE dataset/study?

The closest answers I have seen are

For a TL;DR version, are there more packages (in R or python) or strategies (like maybe quality control/pre-filtering methods) out there that you guys know of that can help alleviate the batch effects present within just one dataset (no merging, no integration, just looking at ONE study and its data)?

Thank you for reading!

scrna-seq batch-effect-correction • 698 views

Before going deeper, can you give more details on the data you have? You say intra-sample, but does that really mean effects within a single set of cells produced in the exact same 10X run as one sample? Or are you talking about samples from the same study that might have been produced on different days, or with different perturbations or conditions?

For the latter I like fastMNN from batchelor very much. It is efficient, fast, and has a small memory footprint (contrary to what a recent paper states), and it gives decent results. Seurat and Harmony do the same kind of correction. SCTransform is more of a transformation method than a batch correction. In fact it is not suitable on its own for integrating datasets, but it can be used to rank genes by residual variance (with respect to the model it fits), which can then be used for feature selection or as a sort of "normalized count". I think the latter use is not that well established, though.
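Before applying any of the correction methods mentioned above, it can help to check whether cells actually separate by sample. The following is only a minimal sketch of one common style of diagnostic (how often a cell's nearest neighbours share its batch label), not the procedure of any particular package. The function names `same_batch_fraction` and `simulate`, the toy 2-D "embedding", and all parameter values are assumptions made up for illustration.

```python
import math
import random


def same_batch_fraction(points, batches, k=5):
    """Mean fraction of each cell's k nearest neighbours sharing its batch label."""
    total = 0.0
    for i, p in enumerate(points):
        # Euclidean distance from cell i to every other cell, nearest first
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        total += sum(batches[j] == batches[i] for j in neighbours) / k
    return total / len(points)


def simulate(shift, n_per_batch=50, seed=0):
    """Hypothetical toy embedding: two batches from one Gaussian, batch B shifted."""
    rng = random.Random(seed)
    points, batches = [], []
    for label, offset in (("A", 0.0), ("B", shift)):
        for _ in range(n_per_batch):
            points.append((rng.gauss(offset, 1.0), rng.gauss(offset, 1.0)))
            batches.append(label)
    return points, batches


mixed = same_batch_fraction(*simulate(shift=0.0))
separated = same_batch_fraction(*simulate(shift=10.0))
print(f"no batch shift:    {mixed:.2f}")
print(f"large batch shift: {separated:.2f}")
```

With two equally sized, well-mixed batches the score should sit near 0.5, and it approaches 1.0 as the batches pull apart. On real data a high score is not proof of a technical artifact, since samples can differ in genuine cell-type composition or condition, so it is best read alongside a UMAP/PCA plot coloured by sample.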


Hello! Thank you for your answer!

When I say "intra-sample" I mean the latter: samples from the same study produced on different days, under different conditions, etc. I was also wondering: can you "integrate" samples within the same study (but that have different conditions and such) using Seurat, Harmony, or other similar packages, or is "integration" something you only do for different studies? If that makes sense?
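On the mechanics of this question: correction methods of this kind generally work from a per-cell batch label, so a sample or patient ID from a single study can play the same role as a study ID. The sketch below illustrates that idea with plain per-batch mean centering, which is a deliberately crude stand-in and not what fastMNN, Seurat, or Harmony do. The function `center_per_batch`, the toy matrix, and the `sample_id` labels are all hypothetical.

```python
from collections import defaultdict


def center_per_batch(expr, batch_labels):
    """Subtract each batch's per-gene mean from the cells in that batch.

    expr: list of per-cell lists of (log-normalised) expression values.
    batch_labels: one label per cell, e.g. a sample or patient ID.
    """
    groups = defaultdict(list)
    for idx, label in enumerate(batch_labels):
        groups[label].append(idx)

    n_genes = len(expr[0])
    corrected = [None] * len(expr)
    for idxs in groups.values():
        # Per-gene mean over the cells of this batch only
        means = [sum(expr[i][g] for i in idxs) / len(idxs) for g in range(n_genes)]
        for i in idxs:
            corrected[i] = [expr[i][g] - means[g] for g in range(n_genes)]
    return corrected


# Hypothetical toy data: 4 cells x 2 genes, two samples from ONE study
expr = [[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]]
sample_id = ["patient1", "patient1", "patient2", "patient2"]

corrected = center_per_batch(expr, sample_id)
print(corrected)
```

Centering like this assumes every sample contains the same mix of cell types, which is rarely true; when samples also differ by condition, it can remove real biology along with the batch effect. That limitation is one reason neighbour-based methods such as fastMNN are usually preferred.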
