Which model to choose in edgeR analysis?
1
0
Entering edit mode
7.8 years ago
biolab ★ 1.4k

Dear all,

Our RNA-seq data has two samples, each of which has three biological replicates. I have two edgeR scripts as follows. I found that they produced different results. My question is: which model (classical linear or GLM) to choose in edgeR analysis?

I wish biostars experts on R analysis could provide me some comments or suggestions. Thank you very much .

- R script 1

d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- calcNormFactors(dge)
design <- model.matrix(~g)
dge <- estimateDisp(dge, design)
fit <- glmQLFit(dge, design)
qlf <- glmQLFTest(fit, coef=2)
topTags(qlf, n=100000)

R script 2

d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
et <- exactTest(dge)
topTags(et, n=100000)
edgeR • 8.8k views
ADD COMMENT
1
Entering edit mode
7.8 years ago
Benn 8.3k

This was once asked on the bioconductor support site:

https://support.bioconductor.org/p/64807/

ADD COMMENT
1
Entering edit mode

And the difference between glmFit and glmQLFit is described here:

https://support.bioconductor.org/p/76790/

ADD REPLY
0
Entering edit mode

Hi, b.nota,

Thank you very much for your reply.

I just performed a variety of tests, and found that if changing the code qlf <- glmQLFTest(fit, coef=2) in the first R script to qlf <- glmQLFTest(fit), then the two scripts produce similar results. Would you please indicate to me the difference of using coef=2 or not?

I really appreciate your help.

ADD REPLY
1
Entering edit mode

Hi,

The coef=2 argument means that your second group will be subtracted from the intercept of your design. Why you get similar results without, I don't know. If you want an explanation from the developers of this package you should ask this question on the bioconductor support site.

https://support.bioconductor.org/t/Latest/

ADD REPLY
0
Entering edit mode

Hi, b.nota, Thank you very much for your reply. I missed your reply these days.

Maybe you misunderstood me. I mean with coef=2 and without coef=2 produce very different results. I still cannot figure out "the intercept of your design" in your comment. Does coef=2 mean group2 vs group1? What does qlf <- glmQLFTest(fit) mean if omitting coef=2?

I really appreciate your answer. Thanks!

ADD REPLY
1
Entering edit mode

You're question is a pure statistical question, concerning linear models and intercept, etc.

I can try to explain it to you, but I am a biologist with programming skills...

Like I suggested before, try the bioconductor support site. Since edgeR is a bioconductor package, support is given on that site. Usually one of the developers or maintainers can explain you how to use the package correctly, and they respond within a day.

Good luck!

ADD REPLY
0
Entering edit mode

Thank you very much, b.nota. Your comments are very helpful!

ADD REPLY

Login before adding your answer.

Traffic: 3274 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6