limma voom -- all genes after filtering are differentially expressed
1
0
Entering edit mode
22 months ago
wiscoyogi ▴ 40

Hello, I am attempting to run a differential expression using limma voom on a set of 10 samples, 5 in condition A and 5 in condition B.

I'm getting some bizarre results, namely that:

  1. All the input genes to the program after expression filtering (9461) come out as differentially expressed
  2. My volcano plot is exceedingly lopsided (e.g. no negative log fold changes)
  3. There’s no ‘logFC’ column in the topTable output.

I'm following the documentation verbatim to the best of my knowledge.

Experimental design: samples were split and treated with either treatment A or treatment B then RNA sequenced.

  • The ratio of the maximum to minimum library size is 10.
  • My plan was to create a design matrix which includes a coefficient for the A vs. B difference (like p 41 of the limma voom guide)
  • My plan was to run voomWithQualityWeights. The voom mean-variance trend indicates that adequate gene filtering was performed.

The ordering of the rows in the design matrix match the ordering of the rows in the counts matrix. My voom: mean-variance trend after running voomWithQualityWeights indicated that genes were adequately filtered.

enter image description here

counts <- read.csv("counts.txt")
dge <- DGEList(counts = counts)

# make the design matrix
A <- c()
BvsA <- c()
for (i in colnames(counts)) {
    A <- append(A, 1)
    if grepl("B", i, fixed = TRUE) {
    BvsA <- append(BvsA, 0)
    } else {
    BvsA <- append(BvsA, 1)
    }
}

design <- cbind(A, BvsA)

# filter lowly expressed genes
keep <- filterByExpr(dge, design)

dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge, method = "TMM")

vwts <- voomWithQualityWeights(counts = dge, design = design,
                               plot = TRUE, method = "genebygene") # dge vs logCPM
vfit2 <- lmFit(vwts)
vfit2 <- eBayes(vfit2)

voomDEG <- topTable(vfit2, number = Inf, sort.by = "B", resort.by = "logFC", p.value = 0.1, adjust = "BH")

I'm so confused why all the genes are coming out as differentially expressed. If I run raw correlation analyses on the samples (B vs. A) I get a great correlation, which would indicate that not all the genes are DEG.

Does anyone have any indications for what's going on here?

expression differential voom RNAseq limma • 1.2k views
ADD COMMENT
1
Entering edit mode
22 months ago
Gordon Smyth ★ 7.0k

The problem is that you are setting up a custom design matrix and then testing every coefficient equal to zero. You haven't tested for differential expression at all.

You say that you are following page 41 of the limma User's Guide. The Guide gives this code:

> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef="MUvsWT", adjust="BH")

If you want to adapt this code to your data, then you would need coef="BvsA". If you had set coef in the topTable call then everything would have been correct.

Your design matrix code is very, very unusual. The standard way to make the design matrix would be

design <- model.matrix(~group)

where group is a factor with levels A and B. If you had made the design matrix in this standard way, then limma would have been able to figure out automatically which coefficient you wanted to test for.

You can easily create a group factor for example by:

group <- factor(grepl("B", colnames(counts)))
levels(group) <- c("A","B")
ADD COMMENT
0
Entering edit mode

thank you so much!!! noted on the design matrix and the importance of specifying the coefficient in topTable!

ADD REPLY

Login before adding your answer.

Traffic: 2052 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6