Question

Batch Effect

0

2 days ago

Umair • 0

I extracted the run_number and flowcell_ID from the first lines of my FASTQ files. Can these really be considered batch variables? Or am I unintentionally reading too much into this information for my RNA-seq dataset, when it does not actually reflect a batch effect?
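For reference, a minimal sketch of the kind of extraction described above. It assumes the reads carry Illumina-style headers of the form `@instrument:run:flowcell:lane:tile:x:y comment`; files whose headers were rewritten will not match this layout. The function names and the example header (instrument and flowcell values) are hypothetical, for illustration only.

```python
import gzip
from typing import NamedTuple


class ReadHeader(NamedTuple):
    instrument: str
    run_number: str
    flowcell_id: str
    lane: str


def parse_illumina_header(header: str) -> ReadHeader:
    """Parse '@instrument:run:flowcell:lane:tile:x:y [comment]' (assumed layout)."""
    if not header.startswith("@"):
        raise ValueError("not a FASTQ header line")
    # Drop the leading '@' and the optional comment after the first space.
    name = header[1:].partition(" ")[0]
    fields = name.split(":")
    if len(fields) < 7:
        raise ValueError(f"unexpected header layout: {header!r}")
    return ReadHeader(*fields[:4])


def first_header(path: str) -> ReadHeader:
    """Read only the first header line of a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return parse_illumina_header(handle.readline().strip())


# Made-up header, for illustration only.
example = "@A00123:45:HXXXXDSXX:2:1101:1000:2000 1:N:0:ACGT"
parsed = parse_illumina_header(example)
print(parsed.run_number, parsed.flowcell_id, parsed.lane)
```

Reading only the first header per file assumes that each file comes from a single run and flowcell; merged files would need every header checked.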

batch effect • 357 views

ADD COMMENT • link 1 day ago by Umair • 0

score 1 · Answer 1 · 2025-11-06

1

2 days ago

Kevin Blighe 89k

Hey,

It is not 'over-interpreting': the information that you have extracted can indeed be used to identify potential batches. In RNA-seq, the sequencing run and flowcell are well-known sources of technical variation (batch effects) and, for this reason, are sometimes explicitly included in the statistical model. The flowcell ID, in particular, can be important.

To check whether these are actually driving a batch effect in your data, I would advise generating a PCA bi-plot (or heatmap) of your normalised counts and colouring the samples by flowcell / run. If you see a clear separation by flowcell or run, then, yes, there is a batch effect. In that case, you can use this information as a covariate in your model, e.g., in DESeq2's design formula.
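As a rough illustration of the PCA check, here is a minimal sketch in Python with NumPy. It is not the DESeq2 workflow itself: it assumes you already have a samples x genes matrix of normalised, log-transformed counts, and it uses simulated data with hypothetical flowcell labels (`FC_A`, `FC_B`) in which one flowcell is deliberately shifted.

```python
import numpy as np


def pca_scores(log_expr, n_components=2):
    """PCA via SVD on a samples x genes matrix of log-scale expression."""
    centered = log_expr - log_expr.mean(axis=0)  # centre each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates
    explained = (s ** 2) / np.sum(s ** 2)  # variance fraction per PC
    return scores, explained[:n_components]


# Simulated toy data (not real counts): 6 samples, 200 genes,
# with a gene-wise offset added to every sample from FC_B.
rng = np.random.default_rng(0)
n_genes = 200
flowcells = ["FC_A"] * 3 + ["FC_B"] * 3
base = rng.normal(5.0, 1.0, size=n_genes)
batch_shift = rng.normal(0.0, 1.0, size=n_genes)
log_expr = np.array([
    base + (batch_shift if fc == "FC_B" else 0.0) + rng.normal(0.0, 0.1, n_genes)
    for fc in flowcells
])

scores, explained = pca_scores(log_expr)
for fc, (pc1, pc2) in zip(flowcells, scores):
    print(f"{fc}\tPC1={pc1:.2f}\tPC2={pc2:.2f}")
print("variance explained:", np.round(explained, 3))
```

In this simulation the two flowcells fall on opposite sides of PC1, which is the pattern to look for when colouring a real plot by flowcell or run. If samples from different flowcells are intermixed instead, the flowcell is unlikely to be a major driver.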

Kevin

ADD COMMENT • link 2 days ago by Kevin Blighe 89k

0

Thank you for your reply, Kevin. Could you please advise me now that I have checked the PCAs of my dataset?

[Attached image: PCA plots of the dataset]

ADD REPLY • link 2 days ago by Umair • 0

1

Based on your non-batch-corrected plots, I would not consider the flow cell to be clearly and meaningfully impacting things. Personally, I'd leave it out of your model design. Note that checking additional PCs (PC3/4) can also be helpful, but it can quickly become a ghost hunt. True batch effects are typically pretty obvious.
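If you do want to look at further PCs without eyeballing every plot, one option is to put a number on how strongly each PC tracks the flowcell. The sketch below computes eta-squared (between-group sum of squares divided by total sum of squares) for each PC; the function name, labels, and PC scores are made up for illustration. With only a few samples per group this is a descriptive check, not a formal test.

```python
from collections import defaultdict


def eta_squared(values, labels):
    """Fraction of the variance in `values` explained by group membership (0 to 1)."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # no variance at all, so nothing to explain
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


# Hypothetical flowcell labels and PC scores, for illustration only.
flowcells = ["FC_A", "FC_A", "FC_A", "FC_B", "FC_B", "FC_B"]
pc_scores = {
    "PC1": [-4.1, 3.9, 0.3, -3.8, 4.2, -0.5],  # flowcells intermixed
    "PC3": [1.9, 2.1, 2.0, -2.0, -1.9, -2.1],  # flowcells separated
}

for pc, values in pc_scores.items():
    print(pc, round(eta_squared(values, flowcells), 3))
```

A value near 1 means that the PC essentially separates the flowcells; a value near 0 means the flowcells are intermixed along that PC. How much weight to give this also depends on how much variance the PC itself explains.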

ADD REPLY • link 2 days ago by jared.andrews07 ★ 19k

0

Thank you for the reply. Would you suggest that I keep all 3 replicates for each treatment, or drop some of them? From the PCA, it appears that at least two replicates for most of my treatments cluster closely together. My PCA confuses me.

ADD REPLY • link 1 day ago by Umair • 0