glmQLFTest coef for multiple comparisons
2
1
Entering edit mode
3.5 years ago
silas008 ▴ 160

Hy, guys

I checked the edegR manual and the explanation for coef is not clear for me. For 2 or 3 groups it's very simple to determine the coef for comparisons (coef:2 for 2vs1 and coef=3 for 3vs1). But what coef should I use if I want to compare many groups like below:

group <- factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5,6,6,6,6,6,7,7,7,7,7))
y <- DGEList(counts=data,group=group)
y
norm_data <- calcNormFactors(y)
design <- model.matrix(~group)
norm_data <- estimateDisp(norm_data,design)
fit <- glmQLFit(norm_data, design)
qlf2vs1 <- glmQLFTest(fit, coef=2)

Is there a rule for that? For example: how to compare groups 7 vs 1 or 5 vs 2... etc?

Thanks in advance

edgeR RNA-Seq • 3.4k views
ADD COMMENT
3
Entering edit mode
3.5 years ago
Gordon Smyth ★ 7.0k

You can keep the design matrix as it is now. To compare 5 vs 2 for example, use

qlf5vs2 <- glmQLFTest(fit, contrast=c(0,-1,0,0,1,0,0))

Why does this work? As you have correctly understood, the coefficients are all relative to level 1. So coef5=group5-group1 and coef2=group2-group1. So coef5-coef2=group5-group2.

ADD COMMENT
2
Entering edit mode
3.5 years ago
ATpoint 81k

I simply would use a design ~0+group and then use contrasts (makeContrasts) as explained in the manual to make any comparison you need.

ADD COMMENT
0
Entering edit mode

It looks easier.

For 6vc1 is it something like this, right?:

design <- model.matrix(~0+group)
norm_data <- estimateDisp(norm_data, design)
fit <- glmQLFit(norm_data, design)
my.contrasts <- makeContrasts(2vs1=2-1, 6vs1=6-1, levels=design)
> qlf.6vs1 <- glmQLFTest(fit, contrast=my.contrasts[,"6vs1"])

One simple question: why in this case should I use ~0+groups instead of ~groups?

Thanks again

ADD REPLY
1
Entering edit mode

Looks about right, I would make the name like G6 instead of 6 though, also the contrasts like G6vsG1 rather than 6vs1, be safe and avoid numbers as variable names.

ADD REPLY
0
Entering edit mode

It worked very well.

I think you didn't see my last question. "One simple question: why in this case should I use ~0+groups instead of ~groups?"

Thanks again

ADD REPLY
1
Entering edit mode

The design without intercept (the one with the zero) does not use the first sample as the reference, therefore you can make all possible combinations via the contrast function.

ADD REPLY
0
Entering edit mode

Actually, it's not properly working.

When I use the classical aproach it works fine. The results are ok, logFC as expected and some differentially expressed genes:

design <- model.matrix(~group)
norm_data <- estimateDisp(norm_data, design)
fit <- glmQLFit(norm_data, design)
qlf_2vs1 <- glmQLFTest(fit, coef=2)
topTags(qlf_2vs1)

With the new code it seems the comparison is not correct. logFC is like -300 for most of genes and almost all of them are differentially expressed:

design <- model.matrix(~0+group)
norm_data <- estimateDisp(norm_data, design)
fit <- glmQLFit(norm_data, design)
my.contrasts <- makeContrasts(G2vsG1=2-1, G3vsG1=3-1, G4vsG1=4-1, G5vsG1=5-1, G6vsG1=6-1, G7vsG1=7-1, levels=design)
qlf2vs1 <- glmQLFTest(fit, contrast=my.contrasts[,"G2vsG1"])
summary(qlf2vs1)
topTags(qlf2vs1)

I can't see any error in the code but it's obviously wrong in some point.

Do you know what is the probable error?

ADD REPLY
1
Entering edit mode

Probably because you use integers as variables which is bad practice in R. Something like G2vsG1=G2-G1 would be better than 2-1. Not sure how the function interpretes integers. In any case it is bad practice to use integers as colnames or variables.

ADD REPLY
0
Entering edit mode

Perfect. Now is everything working well.

Thank you very much

ADD REPLY

Login before adding your answer.

Traffic: 2660 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6