Question

RNAseq: how to make design matrix for limma

0

Entering edit mode

4.8 years ago

vw ▴ 40

Hi all,

I want to compare the result between limma + voom with DESeq2. But I don't know how to make the design matrix for my date when I am using Limma.

I have two groups: WT, ABX. Each group has 6 samples: X1~X6(WT), X7~X12(ABX). It looks like:

head(exprSet, n=1)
                        X.1     X.2       X.3     X.4       X.5     X.6       X.7     X.8
ENSMUSG00000051951      88      113       83      101       57       45       45       50
                         X.9       X.10      X.11      X.12
ENSMUSG00000051951       31        51        59        43

The row name is the geneID, the column is the count for each sample.

My code is below:

## read count matrix from file
exprSet=read.table("ex_matrix_g1g2_h.txt", sep="\t", header=T, row.names = 1, stringsAsFactors = F)
group_list <- factor(c(rep("WT",6), rep("ABX",6)))

design <- model.matrix(~0+group_list) ## Is this correct?
colnames(design) <- levels(group_list)
rownames(design) <- colnames(exprSet)

v <- voom(exprSet,design,normalize="quantile", plot = T)
fit <- lmFit(v,design)
fit2 <- eBayes(fit)
tempOutput = topTable(fit2, n=Inf, adjust.method = 'BH', coef=2)  
## if I want to compare ABX - WT, how to set coef?

Unfortunately, the above code generated very different results compared with DESeq2. In limma's result, almost all genes are significantly different.

I think that maybe I used the wrong design matrix. But I am not familiar with limma. Could some help me to figure out what the matter of my code?

Thank you so much!

Limma RNA-Seq DESeq2 • 4.0k views

ADD COMMENT • link updated 4.8 years ago by Kristoffer Vitting-Seerup ★ 4.0k • written 4.8 years ago by vw ▴ 40

score 4 · Accepted Answer · 2019-06-28

That is because you do not compare the two conditions but test whether ABX is expressed in the first place (whether ABX gene expression different from zero) . It's quite easy to fix - try this instead:

exprSet=read.table("ex_matrix_g1g2_h.txt", sep="\t", header=T, row.names = 1, stringsAsFactors = F)

group_list <- factor(
    x = c(rep("WT",6), rep("ABX",6)),
    levels=c("WT","ABX")                     # Set WT to be the first level
)

design <- model.matrix(~group_list)          # Remove the zero

v <- voom(exprSet,design,normalize="quantile", plot = T)
fit <- lmFit(v,design)
fit2 <- eBayes(fit)

tempOutput = topTable(
    fit2, n=Inf,
    adjust.method = 'BH',
    coef='group_listABX'                     # Easier to read in 6 months than coef=2.
)

By setting WT to be the first level and removing the zero from the design WT will be set to be the intersect - and then when you test the "group_listABX" you are comparing ABX to the intersect (aka you compare WT -> ABX). This also means that positive log2FC values are higher expressed in ABX.

I assumed that you want to do the WT -> ABX comparison. If you did not set WT to be the first levels ABX would be that instead (A is before W in the alphabet) meaning you would compare ABX -> WT, which would mean positive log2FC values would be higher expressed in WT.

Hope this helps.

Cheers Kristoffer