My RNA-seq data were sequenced in 5 batches. Applying a Mann-Whitney (MW) test, I found significant differences in primary aligned read counts between batches. Since the samples are grouped by severity, I compared the mild samples of batch 1 with the mild samples of another batch, and did the same for the moderate and severe categories. The aim was to make sure that the between-batch differences (in primary aligned reads for mild, moderate and severe samples) come only from technical variability, as I can't risk losing the actual biological signal between any of those groups.
I found a significant difference in primary aligned read counts when I compared the mild samples of batch 1 with those of another batch, and likewise for the other categories, but this was not consistent across all comparisons.
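To make the comparison concrete, what I did is roughly equivalent to the minimal sketch below. The batch names and read counts are made-up placeholders (not my real data), and the test is a plain standard-library Mann-Whitney U with a normal approximation and no continuity correction; in practice something like `scipy.stats.mannwhitneyu` would be the usual choice.

```python
from itertools import combinations
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))
    rank_of, tie_term, i = {}, 0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        t = j - i                               # size of this tie group
        rank_of[pooled[i]] = (i + 1 + j) / 2    # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    u1 = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    n = n1 + n2
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, erfc(abs(z) / sqrt(2))           # two-sided p-value


def compare_batches_within_severity(reads):
    """Compare every pair of batches separately within each severity group."""
    results = {}
    for sev in sorted({s for _, s in reads}):
        batches = sorted(b for b, s in reads if s == sev)
        for b1, b2 in combinations(batches, 2):
            results[(sev, b1, b2)] = mann_whitney_u(reads[(b1, sev)], reads[(b2, sev)])
    return results


# Hypothetical placeholder data: (batch, severity) -> primary aligned read counts.
toy_reads = {
    ("batch1", "mild"): [21e6, 22e6, 23e6, 24e6],
    ("batch2", "mild"): [30e6, 31e6, 32e6, 33e6],
    ("batch1", "severe"): [20e6, 25e6, 29e6, 34e6],
    ("batch2", "severe"): [21e6, 24e6, 30e6, 33e6],
}

for key, (u, p) in compare_batches_within_severity(toy_reads).items():
    print(key, f"U={u:.1f}", f"p={p:.3f}")
```

Stratifying by severity like this keeps the batch comparison from being confounded by an unequal mix of mild, moderate and severe samples across batches.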
Considering this, I need to remove the batch effect. Downstream I am looking at coverage breadth (the question being "how much of the gene body is covered by my sequencing reads?"), ... and no tool seems to be available for this. I have read that DESeq2, limma and other packages are used for this kind of correction, so I thought of applying one of them.
Can anyone please help me understand how limma, or another more suitable tool, performs this batch correction (data normalisation), and how I can do this?
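To be explicit about the metric, by coverage breadth I mean the fraction of gene-body positions covered by at least one read. A minimal sketch under simplifying assumptions: the gene body is treated as one contiguous half-open interval (introns are ignored), reads are given as `(start, end)` intervals, and the function name and coordinates are hypothetical.

```python
def coverage_breadth(gene_start, gene_end, read_intervals):
    """Fraction of [gene_start, gene_end) covered by at least one read."""
    gene_len = gene_end - gene_start
    if gene_len <= 0:
        raise ValueError("gene_end must be greater than gene_start")

    # Keep only reads overlapping the gene and clip them to its boundaries.
    clipped = sorted(
        (max(s, gene_start), min(e, gene_end))
        for s, e in read_intervals
        if e > gene_start and s < gene_end
    )

    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # A gap: close the previous merged block and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / gene_len


# Hypothetical example: a 100 bp gene and four reads, one of them outside the gene.
example_reads = [(90, 120), (110, 130), (150, 160), (250, 300)]
print(coverage_breadth(100, 200, example_reads))
```

Because this value is a bounded fraction rather than a count, I am not sure whether count-based normalisation applies to it directly, which is part of what I am asking.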
Thanks in advance!