Question: Single-cell RNA-seq batch correction question
3 months ago by
I am a newbie in computing biological data and I have 30 samples, ranging from 3 to 8 GB in size, obtained while studying embryonic development of cells in small and large groups. I was doing batch correction as the first step of my pipeline and got stuck at this step. I am planning to use the kBET and MNNcorrect batch correction methods and would like to know how I can generate the matrix [rows: cells, columns: genes]. I am assuming that I will copy all the file names to rows, which would be my cells, and copy genes as columns. But I am confused as to which genes to add as columns.

  1. Can you please confirm whether my understanding is right?
  2. Which genes should I choose, and where can I obtain the gene list?
  3. Now I am starting to wonder whether my pipeline is correct. Has anyone come across a pipeline that involves batch correction before QC or alignment to a reference?

What format is your data in? What platform/chemistry was used?

Batch correction isn't done during QC or before alignment, but after counts are generated and prior to dimensionality reduction/clustering.
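
To make the "after counts are generated" point concrete, here is a minimal sketch of how a cells x genes matrix could be assembled once each cell has been quantified. It assumes every cell already has a gene-to-count mapping produced by the quantification step; the function name `build_count_matrix` and the toy cell and gene names are hypothetical and only for illustration. Note that the gene columns are not picked by hand: they come from the annotation used during quantification.

```python
def build_count_matrix(per_cell_counts):
    """Assemble a cells x genes matrix from per-cell {gene: count} dicts.

    per_cell_counts: {cell_name: {gene_id: count}} (hypothetical input shape).
    Returns (cells, genes, matrix), where matrix[i][j] is the count of
    genes[j] in cells[i]; genes missing for a cell are filled with 0.
    """
    cells = sorted(per_cell_counts)
    # Columns are the union of all genes seen across cells.
    genes = sorted({g for counts in per_cell_counts.values() for g in counts})
    matrix = [[per_cell_counts[c].get(g, 0) for g in genes] for c in cells]
    return cells, genes, matrix


# Toy data standing in for real per-cell quantification output.
toy = {
    "cell_A": {"GeneX": 5, "GeneY": 0},
    "cell_B": {"GeneY": 3, "GeneZ": 7},
}
cells, genes, matrix = build_count_matrix(toy)
```

In practice the packages in a typical workflow build this matrix for you from the quantification output; the sketch only shows what the rows and columns represent.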

modified 3 months ago • written 3 months ago by jared.andrews07

Thank you so much jared.andrews07!!

The data is in FASTQ format and was obtained from the SMART-seq platform.

As per your suggestion, I will be doing QC -> alignment -> batch correction -> differential expression analysis -> dimensionality reduction/clustering.

Can you please confirm if the above sequence of steps is possible? If I want to add an optional quantification step can I do it after alignment and before batch correction?

written 3 months ago by getanid1230

Yes, that looks pretty typical to me. Some post-alignment QC will be necessary to remove doublets, dead cells, etc., but that is pretty easy to do and part of any typical workflow. And yes, that is where quantification would be done.
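
As a rough illustration of what post-alignment QC on the count matrix can look like, here is a minimal sketch that drops cells with too few counts, too few detected genes, or a high mitochondrial fraction (a common sign of dead or dying cells). The function name `filter_cells`, the `mt-` gene-name prefix, and every threshold are assumptions for the example, not recommended values; doublet detection is a separate step and is not covered here.

```python
def filter_cells(cells, genes, matrix, min_counts, min_genes,
                 max_mito_frac, mito_prefix="mt-"):
    """Return the names of cells passing simple per-cell QC thresholds.

    matrix[i][j] is the count of genes[j] in cells[i]. Mitochondrial genes
    are identified by a name prefix (an assumption; it depends on the
    annotation in use).
    """
    mito_idx = [j for j, g in enumerate(genes)
                if g.lower().startswith(mito_prefix)]
    kept = []
    for name, row in zip(cells, matrix):
        total = sum(row)                          # library size
        detected = sum(1 for v in row if v > 0)   # genes with any count
        # Treat cells with no counts as failing the mitochondrial check.
        mito_frac = sum(row[j] for j in mito_idx) / total if total else 1.0
        if (total >= min_counts and detected >= min_genes
                and mito_frac <= max_mito_frac):
            kept.append(name)
    return kept


# Toy example with deliberately tiny, arbitrary thresholds.
qc_genes = ["mt-Co1", "Actb", "Gapdh"]
qc_cells = ["good", "dying", "empty"]
qc_matrix = [[1, 50, 49], [60, 20, 20], [0, 0, 0]]
kept = filter_cells(qc_cells, qc_genes, qc_matrix,
                    min_counts=50, min_genes=2, max_mito_frac=0.2)
```

Sensible cutoffs depend on the protocol and tissue, so they are usually chosen by looking at the distributions of these metrics rather than fixed in advance.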

I am going to recommend you take a look through this free online book from Bioconductor, as it's an invaluable resource created by experts in the field (many of whom wrote some of the software you will use).

written 3 months ago by jared.andrews07

Thank you so much!!

written 3 months ago by getanid1230
