Question: Limma contrast matrix help
1
gravatar for chris86
3.6 years ago by
chris86250
United Kingdom, London
chris86250 wrote:

Hi

I thought I had contrast matrices and limma worked out, but it appears I don't. The data frame is a series of columns (samples) with rows (genes) and I am splitting this up according on the different groups to compare which are in order.

len1 <- 7
len2 <- 12
len3 <- 5

group <- factor(c(rep(1, times = len1), rep(2, times = len2), rep(3, times = len3)))

design <- model.matrix(~0+group)
matrix <- data.matrix(data6)
colnames(design)[1] <- 's1'
colnames(design)[2] <- 's2'
colnames(design)[3] <- 's3'

contrast.matrix <- makeContrasts('s1-s2','s2-s3','s1-s3', levels = design)

fit <- lmFit(matrix,design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)
nrow(sig)

This gives me 400 differentially expressed genes. All I want to do is compare s1 gene profile with s2. However, when I exclude s3 (remove it from the data frame) and just run the following:

 

group <- factor(c(rep(1, times = len1), rep(2, times = len2)))
design <- model.matrix(~group) # from EdgeR
matrix <- data.matrix(data6)

#

fit <- lmFit(matrix,design)
fit <- eBayes(fit)
top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)

 

This gives a 1000 differentially expressed genes.

So the two methods are different, but all I thought I was doing in the first step was comparing 2 gene profiles so why the difference?

Thanks.

next-gen • 2.0k views
ADD COMMENTlink modified 3.6 years ago by Biostar ♦♦ 20 • written 3.6 years ago by chris86250

on the second linear model fit, did you subset your input matrix to only include s1 and s2 samples? - The other thing is that you have a non intercept model in the first design, and an intercept model on the second. That's most likely where you're seeing differences. 

ADD REPLYlink written 3.6 years ago by andrew.j.skelton735.5k

It's likely that shrinkage changed due to the removal of the samples from group 3, so put them back in the model and do:

design <- model.matrix(~group)
...stuff...
top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")

and see what you get.

 

ADD REPLYlink written 3.6 years ago by Devon Ryan88k

Yes the S3 group is a lot more variable and it is effecting the estimates of DE genes between S1 and S2 because the variance is being shared. Thanks.

ADD REPLYlink written 3.6 years ago by chris86250
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 886 users visited in the last hour