Differential Expression Analysis with batches which have different biological groups
4.9 years ago
asalimih ▴ 60

Hello All,

I have three batches of data, each from a different dataset.

• batch1: only TypeA samples
• batch2: only TypeB samples
• batch3: TypeB and TypeC samples

How can I perform a differential expression analysis between each pair of the TypeA, TypeB and TypeC groups using the limma package while removing the batch effect? (The batch effect is obvious in the plotMDS plot.)

I've read this post (https://support.bioconductor.org/p/69328/), which explains how to account for batches when using limma, but I think my case is different because some groups appear in only one batch. For example, the TypeA samples are all in batch1 and in no other batch.

How can this be done with limma? Are there any better packages or approaches?

Edit:

After reading the comments saying that it is impossible to remove the batch effect when a biological group appears in only one batch, I am changing my question to a second scenario (#2):

• batch1: TypeA and TypeC samples
• batch2: TypeB and TypeC samples
• batch3: only TypeC samples
R RNA-Seq limma DE batch-effect • 1.9k views

How can you distinguish between batch effect noise and the biological variability you're interested in for batches 1 and 2?


That's the main problem. Is there any package or method for this issue?


Since batch1 contains only TypeA and there is no TypeA in any other batch, it is theoretically impossible to distinguish true biological differences in TypeA from differences caused by a batch effect.
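The confounding described above can be checked directly on the design matrix. The following is a minimal base-R sketch, not a definitive recipe: the sample counts per batch are made up for illustration, and `is_estimable` is a hypothetical helper (not a limma function) that relies on the fact that a contrast is estimable only if it lies in the row space of the design.

```r
# Hypothetical layout for the original scenario (sample counts are assumed):
# batch1 = TypeA only, batch2 = TypeB only, batch3 = TypeB and TypeC
group <- factor(c("TypeA", "TypeA", "TypeB", "TypeB",
                  "TypeB", "TypeB", "TypeC", "TypeC"))
batch <- factor(c(1, 1, 2, 2, 3, 3, 3, 3))
design <- model.matrix(~ 0 + group + batch)

# If the rank is lower than the number of columns, group and batch
# effects cannot all be separated
design_rank <- qr(design)$rank
full_rank <- design_rank == ncol(design)

# Hypothetical helper: appending an estimable contrast as an extra row
# does not increase the rank of the design
is_estimable <- function(contrast, design) {
  qr(rbind(design, contrast))$rank == qr(design)$rank
}

# Coefficient order: groupTypeA, groupTypeB, groupTypeC, batch2, batch3
b_vs_a <- c(-1, 1, 0, 0, 0)
c_vs_b <- c(0, -1, 1, 0, 0)

print(full_rank)
print(is_estimable(b_vs_a, design))
print(is_estimable(c_vs_b, design))
```

With this layout the design is rank deficient, and any comparison involving TypeA is not estimable. TypeC versus TypeB remains estimable because both groups occur together in batch3.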

4.9 years ago
asalimih ▴ 60

After reading several posts, I put together an example for scenario #2 using the limma package, but I'm still not sure whether this approach is correct.

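# Just an example for scenario #2
library(limma)
library(edgeR)

# `data` is assumed to be a matrix of raw counts (genes x 9 samples)
gr <- factor(rep(c("TypeA", "TypeB", "TypeC"), each = 3))
# Batch labels must be wrapped in c(); the layout follows scenario #2:
# TypeA in batch 1, TypeB in batch 2, TypeC in batches 2, 1 and 3
batches <- factor(c(1, 1, 1, 2, 2, 2, 2, 1, 3))

dg <- DGEList(data)
dg <- calcNormFactors(dg)
dg$samples$group <- gr
design <- model.matrix(~ 0 + gr + batches)
# Strip only the "gr" prefix; stripping "batches" as well would leave columns
# named "2" and "3", which makeContrasts() rejects as invalid names
colnames(design) <- gsub("gr", "", colnames(design))

# Pairwise group comparisons, adjusted for batch through the design
contr.matrix <- makeContrasts(
  TypeA.TypeB = TypeB - TypeA,
  TypeB.TypeC = TypeC - TypeB,
  TypeA.TypeC = TypeC - TypeA,
  levels = design)

dg.v <- voom(dg, design)
dg.v.fit <- lmFit(dg.v, design)
dg.v.fit <- contrasts.fit(dg.v.fit, contrasts = contr.matrix)
dg.e.fit <- eBayes(dg.v.fit)

# One table per contrast, in the order defined in contr.matrix
topAB <- topTable(dg.e.fit, coef = 1, number = Inf)
topBC <- topTable(dg.e.fit, coef = 2, number = Inf)
topAC <- topTable(dg.e.fit, coef = 3, number = Inf)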
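One way to sanity-check a design like the one for scenario #2 before fitting is to look at its rank and at the leverage of each sample. This is a base-R sketch under assumed sample counts (three samples per group, one TypeC sample in each batch); the variable names are illustrative and it does not touch the expression data.

```r
# Hypothetical layout for scenario #2 (sample counts are assumed):
# TypeA in batch 1, TypeB in batch 2, TypeC spread over batches 2, 1 and 3
group <- factor(rep(c("TypeA", "TypeB", "TypeC"), each = 3))
batch <- factor(c(1, 1, 1, 2, 2, 2, 2, 1, 3))
design <- model.matrix(~ 0 + group + batch)

# Full column rank means every group and batch coefficient is estimable
design_qr <- qr(design)
full_rank <- design_qr$rank == ncol(design)

# Residual degrees of freedom left for estimating variances
residual_df <- nrow(design) - design_qr$rank

# Leverage (diagonal of the hat matrix); a value of 1 means the sample is
# fitted exactly by its own coefficient
leverage <- rowSums(qr.Q(design_qr)^2)

print(full_rank)
print(residual_df)
print(round(leverage, 3))
```

Here the design is full rank, so all three pairwise group contrasts are estimable. The single sample in batch 3 has leverage 1: its batch coefficient absorbs it entirely, so it adds nothing to the group comparisons. The TypeC samples shared between batch 1 and batch 2 are what link TypeA to TypeB.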