Question: How to decide whether an intercept term is necessary in a GLM
4.7 years ago by
illinois.ks140
Korea, Republic Of
illinois.ks140 wrote:

Hello,

I have read several materials, including the edgeR and GLM manuals. I am wondering whether I need to include an intercept in the model.

I am reading the edgeR User's Guide. For example, suppose we have data from 12 samples:

==============================

> targets
Sample Treat Time
1 Sample1 Placebo 0h
2 Sample2 Placebo 0h
3 Sample3 Placebo 1h
4 Sample4 Placebo 1h
5 Sample5 Placebo 2h
6 Sample6 Placebo 2h
7 Sample1 Drug 0h
8 Sample2 Drug 0h
9 Sample3 Drug 1h
10 Sample4 Drug 1h
11 Sample5 Drug 2h
12 Sample6 Drug 2h

========================================================

The two design matrices look like this:

> Group <- factor(paste(targets$Treat,targets$Time,sep="."))
> cbind(targets,Group=Group)

> design1 <- model.matrix(~0+Group)

> colnames(design1)

[1] "GroupDrug.0h"    "GroupDrug.1h"    "GroupDrug.2h"    "GroupPlacebo.0h" "GroupPlacebo.1h" "GroupPlacebo.2h"

> fit1 <- glmFit(y, design1)

=========================================

> design2 <- model.matrix(~Treat + Treat:Time, data=targets)

> colnames(design2)
[1] "(Intercept)" "TreatDrug"
[3] "TreatPlacebo:Time1h" "TreatDrug:Time1h"
[5] "TreatPlacebo:Time2h" "TreatDrug:Time2h"

> fit2 <- glmFit(y, design2)

==================================

What is the difference between

glmLRT(fit1, coef=2)        vs.       glmLRT(fit2, coef=4)?

Are they testing the same thing, i.e. finding genes responsive to Drug at time 1h (compared to time 0h)? If they are the same, I think there is no reason for the intercept term. If they are not, what is the difference between these tests?

===================================================

rna-seq • 3.7k views
modified 3.9 years ago by Biostar ♦♦ 20 • written 4.7 years ago by illinois.ks140
4.7 years ago by
Devon Ryan (88k)
Freiburg, Germany
Devon Ryan wrote:

They're in no way the same thing. Without an intercept, each coefficient is a group's own level rather than a difference between groups, so you're basically asking if a gene has signal above 0 in a particular group (i.e., glmLRT(fit1, coef=2) is unlikely to answer your biological question). If you want to compare (i.e., contrast) a number of the groups, such as GroupPlacebo.1h vs. GroupDrug.1h, then not having an intercept can make that a little easier.
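
As a rough illustration of that difference, here is a minimal sketch in base R. It is not edgeR itself: the response `y` is made-up log-scale data for a single hypothetical gene, and ordinary least squares (`lm.fit`) stands in for edgeR's negative binomial GLM. The relationship between the two parameterizations depends only on the design matrices, so the same pattern should carry over. The `Treat` levels are assumed to be ordered with Placebo as the reference, which reproduces the column names shown in the question.

```r
set.seed(1)

# Rebuild the 12-sample layout from the question (Placebo is the reference level).
targets <- data.frame(
  Treat = factor(rep(c("Placebo", "Drug"), each = 6),
                 levels = c("Placebo", "Drug")),
  Time  = factor(rep(rep(c("0h", "1h", "2h"), each = 2), times = 2))
)
Group <- factor(paste(targets$Treat, targets$Time, sep = "."))

design1 <- model.matrix(~ 0 + Group)                           # no intercept
design2 <- model.matrix(~ Treat + Treat:Time, data = targets)  # with intercept

# Hypothetical response for one gene (assumption: invented numbers).
y <- rnorm(12, mean = 5 + as.integer(Group))

b1 <- lm.fit(design1, y)$coefficients
b2 <- lm.fit(design2, y)$coefficients

# coef 2 of design1 is simply the level of the Drug.1h group.
drug_1h_mean <- mean(y[Group == "Drug.1h"])

# A contrast on design1 that compares Drug.1h with Drug.0h.
contrast <- c(-1, 1, 0, 0, 0, 0)
names(contrast) <- colnames(design1)
drug_1h_vs_0h <- sum(contrast * b1)

print(c(coef2_design1 = unname(b1[2]), group_mean = drug_1h_mean))
print(c(coef4_design2 = unname(b2[4]), contrast_design1 = drug_1h_vs_0h))
```

In this sketch, coefficient 2 of `design1` equals the Drug.1h group level, while coefficient 4 of `design2` equals the Drug.1h minus Drug.0h difference. In edgeR, a vector like `contrast` is what would be passed as `glmLRT(fit1, contrast = contrast)` to test that difference with the no-intercept design.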

I should note that times like this are when having taken a bit of linear algebra comes in quite handy. If you're affiliated with a university and have never taken linear algebra, I would highly recommend it. You'll find many statistical concepts (such as what a model matrix is actually doing) much simpler after that.
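
For readers who want to see the linear algebra at work, the following base R sketch checks that the two design matrices from the question describe the same model in different coordinates. The variable names `Tmat`, `max_err` and `row4` are introduced here for illustration only, and the `Treat` level order (Placebo first) is an assumption chosen to match the column names in the question.

```r
# Rebuild the sample layout from the question (base R only).
targets <- data.frame(
  Treat = factor(rep(c("Placebo", "Drug"), each = 6),
                 levels = c("Placebo", "Drug")),
  Time  = factor(rep(rep(c("0h", "1h", "2h"), each = 2), times = 2))
)
Group <- factor(paste(targets$Treat, targets$Time, sep = "."))

design1 <- model.matrix(~ 0 + Group)
design2 <- model.matrix(~ Treat + Treat:Time, data = targets)

# Both matrices have full column rank: six columns, six estimable quantities.
rank1 <- qr(design1)$rank
rank2 <- qr(design2)$rank

# Solve design2 %*% Tmat = design1. The residual is zero only if the two
# designs span the same column space, i.e. they are reparameterizations.
Tmat <- qr.solve(design2, design1)
max_err <- max(abs(design2 %*% Tmat - design1))

# Row 4 of Tmat writes design2's 4th coefficient (TreatDrug:Time1h) as a
# combination of design1's group coefficients.
row4 <- round(Tmat[4, ], 10)

print(c(rank1 = rank1, rank2 = rank2))
print(max_err)
print(row4)
```

Row 4 comes out as -1 on the Drug.0h column and +1 on the Drug.1h column, which is the difference between those two groups expressed in terms of the no-intercept coefficients.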