Question: Which model to choose in edgeR analysis?
0
gravatar for biolab
3.3 years ago by
biolab1.1k
biolab1.1k wrote:

Dear all,

Our RNA-seq data has two samples, each of which has three biological replicates. I have two edgeR scripts as follows. I found that they produced different results. My question is: which model (classical linear or GLM) to choose in edgeR analysis?

I wish biostars experts on R analysis could provide me some comments or suggestions. Thank you very much .

- R script 1

d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- calcNormFactors(dge)
design <- model.matrix(~g)
dge <- estimateDisp(dge, design)
fit <- glmQLFit(dge, design)
qlf <- glmQLFTest(fit, coef=2)
topTags(qlf, n=100000)

R script 2

d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
et <- exactTest(dge)
topTags(et, n=100000)
edger • 3.0k views
ADD COMMENTlink modified 3.3 years ago by Benn7.9k • written 3.3 years ago by biolab1.1k
1
gravatar for Benn
3.3 years ago by
Benn7.9k
Netherlands
Benn7.9k wrote:

This was once asked on the bioconductor support site:

https://support.bioconductor.org/p/64807/

ADD COMMENTlink written 3.3 years ago by Benn7.9k
1

And the difference between glmFit and glmQLFit is described here:

https://support.bioconductor.org/p/76790/

ADD REPLYlink written 3.3 years ago by Benn7.9k

Hi, b.nota,

Thank you very much for your reply.

I just performed a variety of tests, and found that if changing the code qlf <- glmQLFTest(fit, coef=2) in the first R script to qlf <- glmQLFTest(fit), then the two scripts produce similar results. Would you please indicate to me the difference of using coef=2 or not?

I really appreciate your help.

ADD REPLYlink modified 3.3 years ago • written 3.3 years ago by biolab1.1k
1

Hi,

The coef=2 argument means that your second group will be subtracted from the intercept of your design. Why you get similar results without, I don't know. If you want an explanation from the developers of this package you should ask this question on the bioconductor support site.

https://support.bioconductor.org/t/Latest/

ADD REPLYlink modified 3.3 years ago • written 3.3 years ago by Benn7.9k

Hi, b.nota, Thank you very much for your reply. I missed your reply these days.

Maybe you misunderstood me. I mean with coef=2 and without coef=2 produce very different results. I still cannot figure out "the intercept of your design" in your comment. Does coef=2 mean group2 vs group1? What does qlf <- glmQLFTest(fit) mean if omitting coef=2?

I really appreciate your answer. Thanks!

ADD REPLYlink modified 3.3 years ago • written 3.3 years ago by biolab1.1k
1

You're question is a pure statistical question, concerning linear models and intercept, etc.

I can try to explain it to you, but I am a biologist with programming skills...

Like I suggested before, try the bioconductor support site. Since edgeR is a bioconductor package, support is given on that site. Usually one of the developers or maintainers can explain you how to use the package correctly, and they respond within a day.

Good luck!

ADD REPLYlink written 3.3 years ago by Benn7.9k

Thank you very much, b.nota. Your comments are very helpful!

ADD REPLYlink written 3.3 years ago by biolab1.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1720 users visited in the last hour