**330**wrote:

Hi

I thought I had contrast matrices and limma worked out, but it appears I don't. The data frame is a series of columns (samples) with rows (genes) and I am splitting this up according on the different groups to compare which are in order.

len1 <- 7

len2 <- 12

len3 <- 5

group <- factor(c(rep(1, times = len1), rep(2, times = len2), rep(3, times = len3)))

design <- model.matrix(~0+group)

matrix <- data.matrix(data6)

colnames(design)[1] <- 's1'

colnames(design)[2] <- 's2'

colnames(design)[3] <- 's3'

contrast.matrix <- makeContrasts('s1-s2','s2-s3','s1-s3', levels = design)

fit <- lmFit(matrix,design)

fit2 <- contrasts.fit(fit, contrast.matrix)

fit2 <- eBayes(fit2)

top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")

sig <- subset(top2, adj.P.Val<0.05)

nrow(sig)

This gives me 400 differentially expressed genes. All I want to do is compare s1 gene profile with s2. However, when I exclude s3 (remove it from the data frame) and just run the following:

group <- factor(c(rep(1, times = len1), rep(2, times = len2)))

design <- model.matrix(~group) # from EdgeR

matrix <- data.matrix(data6)

#

fit <- lmFit(matrix,design)

fit <- eBayes(fit)

top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")

sig <- subset(top2, adj.P.Val<0.05)

This gives a 1000 differentially expressed genes.

So the two methods are different, but all I thought I was doing in the first step was comparing 2 gene profiles so why the difference?

Thanks.

on the second linear model fit, did you subset your input matrix to only include s1 and s2 samples? - The other thing is that you have a non intercept model in the first design, and an intercept model on the second. That's most likely where you're seeing differences.

5.9kIt's likely that shrinkage changed due to the removal of the samples from group 3, so put them back in the model and do:

and see what you get.

94kYes the S3 group is a lot more variable and it is effecting the estimates of DE genes between S1 and S2 because the variance is being shared. Thanks.

330