Question

How to model a pre and post treatment in limma while adjusting for control

0

Entering edit mode

3.6 years ago

AST • 0

I have RNA-seq data from the GEO database that has three groups - Control (C), Disease1 (D1), and Disease2 (D2). The patients of D1 and D2 were treated with a drug (T) and data was generated on pre and post-treatment disease groups i.e. I have 5 grouping factors - C, D1-pre, D1-post, D2-pre, D2-post. I am a bit confused while modeling this nested data to identify DE genes in limma. I have two questions in mind:

What are DE genes in D1 and D2 compared to control, and
Whether the drug has a differential effect on the expression of DE genes in D1 compared to D2.

I tried to make a model matrix as:

library(limma)
patient <- data.frame(
    patient$patient <- factor(paste(rep(1:5, times = 5),"_",patient$diagnosis))
    diagnosis = factor(rep(c("D1", "D2","Control"), each=10, length.out=25)),
    treat = factor(rep(c("Pre", "Post"), each=5, times=3, length.out=25)))
patient$group <- factor(paste(patient$diagnosis,patient$treat))
design <- model.matrix(~0+group,patient)
colnames(design) <- c("C", "D1post", "D1pre", "D2post", "D2pre")

Since I am not looking for a pair-wise comparison, I am a bit confused with the modeling. I am thinking in lines, something like:

des1 <- design[,c(1:3)]
corfit <- duplicateCorrelation(eset1,des1,block=sheet1$patient) 
#eset1 is the subsetted data matrix with expression levels
fit <- lmFit(eset1,des1,block=sheet1$patient,correlation=corfit$consensus) 
fit.e <- eBayes(fit, trend=TRUE, robust=TRUE)
topTable(fit, coef=2)

and a similar model for D2.

I am not sure how correct these models will be to identify 'true' DEGs in post-treatment groups for both diseases.

I will be thankful if someone could please help me with modeling.

R anova regression limma • 538 views

ADD COMMENT • link 3.6 years ago by AST • 0