design matrix considering treatment, timepoints and batches
1
0
Entering edit mode
3.0 years ago
Iván ▴ 60

Hi,

I'm analysing a microarray with 59 samples distributed across the following groups and timepoints:

SampleName  Group   Batch   Timepoint
sample1 control_30  1   30
sample2 control_30  2   30
sample3 control_30  2   30
sample4 control_30  3   30
sample5 control_30  4   30
sample6 control_30  4   30
sample7 control_30  4   30
sample8 control_30  4   30
sample9 treat1_30   1   30
sample10    treat1_30   1   30
sample11    treat1_30   2   30
sample12    treat1_30   2   30
sample13    treat2_30   1   30
sample14    treat2_30   1   30
sample15    treat2_30   2   30
sample16    treat2_30   2   30
sample17    treat2_30   3   30
sample18    treat3_30   1   30
sample19    treat3_30   1   30
sample20    treat3_30   2   30
sample21    treat3_30   2   30
sample22    control_60  5   60
sample23    control_60  5   60
sample24    control_60  6   60
sample25    control_60  6   60
sample26    treat1_60   5   60
sample27    treat1_60   5   60
sample28    treat1_60   6   60
sample29    treat1_60   6   60
sample30    treat2_60   5   60
sample31    treat2_60   5   60
sample32    treat2_60   6   60
sample33    treat2_60   6   60
sample34    treat2_60   3   60
sample35    treat3_60   5   60
sample36    treat3_60   5   60
sample37    treat3_60   6   60
sample38    treat3_60   6   60
sample39    control_90  7   90
sample40    control_90  7   90
sample41    control_90  8   90
sample42    control_90  8   90
sample43    treat1_90   3   90
sample44    treat1_90   3   90
sample45    treat1_90   3   90
sample46    treat1_90   3   90
sample47    treat2_90   7   90
sample48    treat2_90   7   90
sample49    treat2_90   8   90
sample50    treat2_90   8   90
sample51    treat3_90   7   90
sample52    treat3_90   7   90
sample53    treat3_90   8   90
sample54    treat3_90   8   90
sample55    treat3_90   3   90
sample56    treat4_90   7   90
sample57    treat4_90   7   90
sample58    treat4_90   8   90
sample59    treat4_90   8   90

We're interested in evaluating the differences between each treatment VS. control within each timepoint, so each treatN in timepoint N is compared against the respective control_N.

My current code and design is:

# Design matrix
batch <- factor(y$targets$Batch)
group <- factor(y$targets$Group)
design <- model.matrix(~0 + batch + group)

colnames(design) <- gsub("group", "", colnames(design))
colnames(design) <- gsub("batch ", "", colnames(design))

contr.matrix <- makeContrasts(
  # timepoint 30
  Cont.VS.Treat1.30dpi = treat1_30-control_30,
  Cont.VS.Treat2.30dpi = treat2_30-control_30,
  Cont.VS.Treat3.30dpi = treat3_30-control_30,
  # timepoint 60
 Cont.VS.Treat1.60dpi = treat1_60-control_60,
  Cont.VS.Treat2.60dpi = treat2_60-control_60,
  Cont.VS.Treat3.60dpi = treat3_60-control_60,
  # timepoint 90
 Cont.VS.Treat1.90dpi = treat1_90-control_90,
  Cont.VS.Treat2.90dpi = treat2_90-control_90,
  Cont.VS.Treat3.90dpi = treat3_90-control_90,
  levels = colnames(design))

fit <- lmFit(y, design)
fit2 <- contrasts.fit(fit, contr.matrix)
fit2 <- eBayes(fit2,trend=TRUE,robust=TRUE)

Then, to extract the DEGs between treat1 and control in timepoint 30 while accounting for batch effects I get the topTable from the relevant coefficient, as follows:

topTable(fit2, coef = 1, adjust.method = "BH", n=5, sort.by = "P")

While for treat2 and control in timepoint 30 would be:

topTable(fit2, coef = 2, adjust.method = "BH", n=5, sort.by = "P")

etc.

I would like to assert whether this is the correct way to construct my design matrix and contrasts as I'm getting very different number of DEGs from a previous analysis performed by someone else in our group with the same dataset. Help will be greatly appreciated.

thank you!

Ivàn

limma design microarray experimental • 837 views
ADD COMMENT
0
Entering edit mode

Anyone willing to input on this? Thank you! :-)

ADD REPLY
0
Entering edit mode
3.0 years ago

I don't see anything of major concern; however, one thing that may result in extra bias is that your controls are sometimes being used from batches not also used by treatment. For example:

treat1_30 (batch 1 and 2) vs control_30 (batch 1, 2,3, and 4)

To be frank, you may consider first removing the batch effect from the data via limma::removeBatchEffect(), or use batch as a blocking factor (special functionality supplied by limma).

Just some extra things to try.

Kevin

ADD COMMENT

Login before adding your answer.

Traffic: 2853 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6