Limma experiment design and making contrasts
27 days ago
kra277 • 0

Hi,

I am a novice working on a 450k methylation array analysis. I have a very simple design: I want to find the genes that are differentially methylated between smokers (1) and non-smokers (0). This is what I did:

# using smoking_primary as the factor in interest
design <- model.matrix(~0 + smoking_primary)

# Make contrasts: 0 is the control and 1 is the test
contrast <- makeContrasts(smoking_primary0 - smoking_primary1,
                          levels = design)

# Fit to the methylation set
fit <- lmFit(m_norm_qc, design)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)

## Add the annotations to the results
ann450kSub <- ann450k[match(rownames(m_norm_qc), ann450k$Name),
                      c(1:4, 12:19, 24:ncol(ann450k))]

DMPs <- topTable(fit2, num=Inf, coef=1, genelist = ann450kSub)


Could you please review this and tell me if it is the correct way to do the analysis?

In addition, how should I approach adding covariates to my design? If you could point me to a resource with more information, that would be very helpful. I checked the limma manual, but I found it a little confusing for a simple design like mine.

Thank you for your time on this post.

limma methylation 450k • 499 views
27 days ago

It seems generally okay. For the contrast, you may want to instead use:

contrast <- makeContrasts(
  smoking = smoking_primary1 - smoking_primary0,
  levels = design)


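That is, we assign a name, smoking, to the contrast, and we make 1 the numerator and 0 the denominator of the fold change (on the log scale, smokers minus non-smokers). A positive logFC then indicates higher values in smokers; with your original contrast (0 minus 1) the sign would be flipped.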

Later, when you run topTable(), I believe it is 'safer' to refer to coefficients by name rather than by position; so, you'd use:

DMPs <- topTable(fit2, num = Inf, coef = 'smoking', genelist = ann450kSub)
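
To make the direction of the contrast concrete, here is a minimal sketch on made-up values. The phenotype vector and the group means are hypothetical; contrast_manual is my own hand-built stand-in for what the named contrast encodes, and the limma call only runs if the package is installed.

```r
# Hypothetical phenotype: three non-smokers (0), three smokers (1)
smoking_primary <- factor(c(0, 0, 0, 1, 1, 1))
design <- model.matrix(~0 + smoking_primary)

# Hand-built equivalent of the named contrast: smokers minus non-smokers
contrast_manual <- matrix(
  c(-1, 1),
  ncol = 1,
  dimnames = list(Levels = colnames(design), Contrasts = "smoking")
)

# Made-up group means for a single probe; smokers are higher by construction
group_means <- c(smoking_primary0 = -1.0, smoking_primary1 = 0.5)

# Applying the contrast gives smokers minus non-smokers, so it is positive here
effect <- sum(contrast_manual[, "smoking"] * group_means)
print(effect)

# If limma is available, build the same contrast with makeContrasts()
if (requireNamespace("limma", quietly = TRUE)) {
  contrast <- limma::makeContrasts(
    smoking = smoking_primary1 - smoking_primary0,
    levels = design
  )
  print(contrast)  # a single column named "smoking"
}
```

Because the column is named, coef = 'smoking' keeps pointing at the right comparison even if more contrasts are added later.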


## ֎֎֎֎֎֎֎֎֎֎֎֎

With regard to covariates, these are added when you create the design:

design <- model.matrix(~0 + smoking_primary + BMI + sex + income)


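Then, to adjust for these, rebuild the contrast against this new design (makeContrasts() takes levels = design, so it has to match the design you pass to lmFit()) and derive test statistics for smoking_primary as you did previously. limma will do the remainder (the adjustment for the covariates) for you.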
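
As a rough sketch of the covariate workflow, the following uses simulated data only. The pheno data frame and the m_sim matrix are hypothetical stand-ins for your sample sheet and m_norm_qc, income is left out for brevity, and the limma steps only run if the package is installed.

```r
set.seed(1)
n <- 12

# Hypothetical sample sheet; smoking_primary and sex must be factors
pheno <- data.frame(
  smoking_primary = factor(rep(c(0, 1), each = n / 2)),
  BMI = rnorm(n, mean = 26, sd = 3),
  sex = factor(rep(c("F", "M"), times = n / 2))
)

# One column per smoking level, plus one column per covariate
design <- model.matrix(~0 + smoking_primary + BMI + sex, data = pheno)
print(colnames(design))

if (requireNamespace("limma", quietly = TRUE)) {
  # Simulated values standing in for the real methylation matrix
  m_sim <- matrix(rnorm(200 * n), nrow = 200,
                  dimnames = list(paste0("cg_sim", 1:200), NULL))

  # The contrast is rebuilt against the covariate-adjusted design
  contrast <- limma::makeContrasts(
    smoking = smoking_primary1 - smoking_primary0,
    levels = design
  )
  fit <- limma::lmFit(m_sim, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))
  print(head(limma::topTable(fit2, coef = "smoking", number = Inf)))
}
```

The contrast itself does not change; only the design it is built from does, which is why the covariates need no contrast terms of their own.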

Kevin


That is very insightful. Thank you very much for the answer. Also, if I may ask, could you please point me to the articles for understanding the usage of design and contrasts?


Hi, these follow the same principles as the formulae used in regression modelling, so you may want to focus on that when searching. What limma is doing is running independent models of the form:

gene1 ~ 0 + smoking_primary
gene2 ~ 0 + smoking_primary
gene3 ~ 0 + smoking_primary
et cetera


That is, it's a separate linear regression for each gene (or, on a methylation array, each probe).
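
To illustrate the per-row models above, here is a small base R sketch on simulated values (the probe names and data are made up). It fits one lm() per row and checks that the smoking effect equals the difference in group means. Note that limma additionally moderates the variances in eBayes(), so its t-statistics will differ from plain lm() even though the coefficients agree.

```r
set.seed(42)
n <- 10
smoking_primary <- factor(rep(c(0, 1), each = n / 2))

# Three simulated probes (rows) across ten samples (columns)
m_values <- matrix(rnorm(3 * n), nrow = 3,
                   dimnames = list(c("cg_a", "cg_b", "cg_c"), NULL))

# Fit y ~ 0 + smoking_primary for one row; return smokers minus non-smokers
fit_one <- function(y) {
  coefs <- coef(lm(y ~ 0 + smoking_primary))
  unname(coefs["smoking_primary1"] - coefs["smoking_primary0"])
}
effect <- apply(m_values, 1, fit_one)

# Without an intercept, the coefficients are simply the two group means
group_diff <- rowMeans(m_values[, smoking_primary == "1"]) -
  rowMeans(m_values[, smoking_primary == "0"])

print(all.equal(unname(effect), unname(group_diff)))
```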


Thank you very much for this. Much appreciated.