Limma-Voom which of two models (and p-values) is correct?
1
0
Entering edit mode
7.6 years ago

Dear Biostars, I'm analysing RNA-seq data with Limma-Voom and after reading and apply different methods of analysis I have arrived to two different results and I'm not sure which one is correct. Here the specs: n=60 (~30 per tissue) The experiment (Multi-level time course experiment):

dataForAnalysis = data.frame(Sample=c("s_1", "s_1", "s_2", "s_2", "s_1", "s_1", "s_2", "s_2"), Tissue=c("A", "A","A", "A", "B", "B", "B", "B"), Condition=c("before", "after","before", "after","before", "after","before", "after"))

Sample Tissue Condition(Sampling time)

s_1 A before

s_1 A after

s_2 A before

s_2 A after

s_1 B before

s_1 B after

s_2 B before

s_2 B after

Analysis No. 1

designMatrix = model.matrix(~ Tissue + Condition, dataForAnalysis)

nf = calcNormFactors(counts)

forDgeData.voom <- voom(counts, designMatrix, plot=False,  lib.size=colSums(counts)*nf)

forDgeData.dups = duplicateCorrelation(forDgeData.voom, designMatrix, block= dataForAnalysis$Sample)

forDgeData.fitRobust = lmFit(forDgeData.voom,  forDgeData.voom$design, correlation= forDgeData.dups$consensus, block= dataForAnalysis$Sample, method='robust')

 forDgeData.ebayes = eBayes(forDgeData.fitRobust, robust=True)

Using this method, these are my results:

summary(decideTests(forDgeData.ebayes))

(Intercept) TissueA ConditionAfter

-1 1615 6594 22

0 446 1573 15631

1 13631 7525 39

From these results, I'm taking the ones in the "ConditionAfter" to get the DE genes between the conditions using:

toptable(forDgeData.ebayes, coef="ConditionAfter", ...)

Analysis No. 2 (For this analysis I pasted tissue and condition for the design matrix and use contrasts)

predesignMatrix = factor(paste(dataForAnalysis$Tissue, dataForAnalysis$Condition, sep='.'))

designMatrix = model.matrix(~ 0 + predesignMatrix)

nf = calcNormFactors(counts)

forDgeData.voom <- voom(counts, designMatrix, plot=False,  lib.size=colSums(counts)*nf)

forDgeData.dups = duplicateCorrelation(forDgeData.voom, designMatrix, block= dataForAnalysis$Sample)

forDgeData.fitRobust = lmFit(forDgeData.voom,  forDgeData.voom$design, correlation= forDgeData.dups$consensus, block= dataForAnalysis$Sample, method='robust')

 forDgeData.ebayes = eBayes(forDgeData.fitRobust, robust=True)


designMatrix.forContrast = makeContrasts(diff_Before_vs_After = (A.after-A.before)-(B.after-B.before), levels= designMatrix)

forDgeData.fitRobust.fitRobust2 = contrasts.fit(forDgeData.fitRobust, designMatrix.forContrast)

forDgeData.ebayes = eBayes(forDgeData.fitRobust.fitRobust2)

Using this method, these are my results:

summary(decideTests(forDgeData.ebayes))

diff_Before_vs_After

-1 7

0 15680

1 5

So, comparing results from Analysis No. 1 and No. 2, only 3 genes overlap. My main question here is if there are DE genes between the conditions (before and after), taking into account the tissue factor and also that the samples are duplicated.

Any help or advice... more than welcome!

Thanks

RNA-Seq rna-seq • 2.3k views
ADD COMMENT
0
Entering edit mode
7.6 years ago

In the first model, you're asking for DE genes due to Condition. In the second model, you're asking for DE genes due to the interaction between Condition and Tissue. These are completely different questions, so it's unsurprising that there's a difference.

ADD COMMENT

Login before adding your answer.

Traffic: 3296 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6