Question

coef /makeContrasts very different results

0

Entering edit mode

12 months ago

RNAseqer ▴ 260

I have a situation where I have a factoral independant variable 'suicide' with three levels 'non_suicide', suicide, and 'unkown'. I have been setting up my analysis thus:

suicide.non.undet <- as.factor(df$suicide) CauseofDeath.recode <- as.factor(df$CauseofDeath) Sex <- as.factor(df$Sex) Smoking <- as.factor(df$Smoking) Ethanol <- as.factor(df$Ethanol) svseq1 <- as.numeric(df$svseq1)

design1 <- model.matrix(~ suicide + CauseofDeath +Sex +Smoking + Ethanol +svseq1)

Because control or 'non_suicide' is the lowest level it does not appear as a column in my design matrix. Instead, it should be the intercept. 'Suicide' cases appear as the second column in my design matrix. So I have been performing my regression using coef=2:

dge.counts <- DGEList(counts=counts )
y1 <- calcNormFactors(dge.counts, method="TMM")
y1 <- estimateDisp(y1, design1)


fit.1 <- glmQLFit(y1,  design1, robust=TRUE)
result.1 <- glmQLFTest(fit.1, coef=2)

The results I get from this are reasonable, looking at the qqplot of pvalues.

However, because i have three levels in my independant variable, and I like to cut, paste, and re-work code from one project to the next as needed, I decided it would be safer to specify the exact contrast I was performing using the makeContrasts method, to prevent future me from accidentally running the wrong contrast if the order of the levels of a future project didn't set the control as the intercept. So i did the following:

contrast2 <- makeContrasts(suicide - intercept , levels=design1)
fit.2 <- glmQLFit(y1,  design.1, robust=TRUE)
result2 <- glmQLFTest(fit.2, contrasts=contrast2)

Doing things this way I got a drastically different result with a very different pvalue qqplot (pvalue histogram shows a massive shift to 'significant' pvalues).

My question is this, I know I am doing something wrong here, but what? I got the same result when I tried setting up my design matrix:

design2 <- model.matrix(~ 0 + suicide + CauseofDeath+Sex +Smoking + Ethanol +svseq1)

and specifying the contrasts as

contrast3 <- makeContrasts(non_suicide - suicide , levels=design2)

Shouldnt the coef=2 have been essentially the same as contrasting suicide to the intercept?

makeContrasts coef2 edgeR • 809 views

ADD COMMENT • link updated 11 months ago by Gordon Smyth ★ 7.0k • written 12 months ago by RNAseqer ▴ 260

score 2 · Answer 1 · 2023-05-16

No, testing coef=2 is very different from contrasting suicide to intercept. It's actually never correct to include the intercept in a contrast definition. The intercept caries information about the baseline expression level of the gene, not about differences between patient groups.

The use of contrasts and intercepts is discussed at some length in the limma User's Guide. All the discussion there is also relevant to edgeR.