Interpreting edgeR DGE results - which condition is the logFC referring to?
2.4 years ago

Basically what the title says - when interpreting edgeR results, which condition is the logFC referring to?

For example, see the screenshot below.

Does this mean these genes are upregulated in condition A (vs B)?

[screenshot: example pic]

logFC DGE edgeR • 1.3k views
2.4 years ago

Show your makeContrasts()
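For readers who have not seen it before: `makeContrasts()` is a limma function (also commonly used with edgeR) that spells out which group is subtracted from which. A minimal sketch, assuming two conditions with the made-up labels `conditionA` and `conditionB`; the contrast name `BvsA` is also just illustrative:

```r
library(limma)  # provides makeContrasts()

# Hypothetical grouping: two samples per condition
group <- factor(c("conditionA", "conditionA", "conditionB", "conditionB"))

# Design without intercept, one column per condition
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

# Explicit contrast: B minus A, so a positive logFC means higher in B
contrast <- makeContrasts(BvsA = conditionB - conditionA, levels = design)
contrast
```

The sign of each coefficient in the printed contrast (+1 for the numerator, -1 for the denominator) is what determines the direction of the logFC.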


Sorry I don't know what this means. I used a perl script from Trinity to run edgeR

$TRINITY_HOME/Analysis/DifferentialExpression/run_DE_analysis.pl

2.4 years ago
ATpoint 81k

Unless the script did something custom, the reference level (i.e. the denominator) is the condition that comes first alphabetically. That is because the conditions are converted to a factor, and factor levels are sorted alphabetically by default. So here it is probably conditionB/conditionA, meaning a positive logFC indicates higher expression in B.
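A minimal base-R sketch of the default factor behaviour, using made-up condition labels. It only illustrates how R orders factor levels, not what the Trinity script itself does:

```r
# Hypothetical sample labels, deliberately not in alphabetical order
condition <- c("conditionB", "conditionB", "conditionA", "conditionA")

# factor() sorts levels alphabetically by default, so conditionA comes first
group <- factor(condition)
levels(group)

# The first level is the reference in model.matrix(~group);
# the remaining column is the B-versus-A coefficient
design <- model.matrix(~group)
colnames(design)

# relevel() makes a different condition the reference
group_b_ref <- relevel(group, ref = "conditionB")
levels(group_b_ref)
```

If a specific direction is wanted, setting the reference explicitly with `relevel()` avoids depending on how the labels happen to sort.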

That is the relevant line:

https://github.com/trinityrnaseq/trinityrnaseq/blob/master/Analysis/DifferentialExpression/run_DE_analysis.pl#L603

It seems to be just standard settings, so A should be the reference (= denominator), I think. If the tool returns the normalized counts, you can simply check for a few genes whether a positive logFC corresponds to higher counts in the B samples than in the A samples.
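That check can be sketched in base R as follows. All numbers, gene names, and sample names here are invented for illustration; with real output you would substitute the tool's normalized counts and its logFC column:

```r
# Hypothetical normalized counts: two genes, two samples per condition
norm_counts <- matrix(
  c(10, 12, 40, 44,    # gene_1: higher in B
    80, 76, 20, 18),   # gene_2: higher in A
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene_1", "gene_2"), c("A_1", "A_2", "B_1", "B_2"))
)
group <- factor(c("conditionA", "conditionA", "conditionB", "conditionB"))

# Mean per condition, then a crude log2 ratio of B over A
mean_a <- rowMeans(norm_counts[, group == "conditionA"])
mean_b <- rowMeans(norm_counts[, group == "conditionB"])
crude_logfc_b_over_a <- log2(mean_b / mean_a)

# Hypothetical logFC values as reported by the tool for the same genes
reported_logfc <- c(gene_1 = -1.9, gene_2 = 2.0)

# TRUE where the reported sign matches B/A; FALSE suggests the tool reports A/B
same_direction <- sign(reported_logfc) == sign(crude_logfc_b_over_a)
same_direction
```

The crude ratio will not match the model-based logFC in magnitude, but for clearly differential genes the sign should agree, which is all this check needs.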


Huh ok thanks, weirdly it seems to be the other way around! Positive logFC means higher in condition A.

Here is a screenshot of the DE subset file that is outputted automatically and also contains columns with the counts

[screenshot: DE-subset]


At least you have your answer :)

Just for context, here is how it "normally looks"; I'm not sure what that script of yours is doing:

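library(edgeR)     # also attaches limma (voom, lmFit, eBayes, topTable)
library(magrittr)  # provides the %>% pipe used below

# Simulated counts: 5000 genes; samples 1-2 are conditionA, samples 3-4 are conditionB
y <- DGEList(counts=matrix(rnbinom(5000*4,mu=5,size=2),5000,4),
             group=rep(c("conditionA", "conditionB"), each=2))
rownames(y) <- paste("gene", 1:nrow(y), sep="_")

# ~group makes the alphabetically first level (conditionA) the reference
design <- model.matrix(~group, y$samples)
v <- voom(y, design)
fit <- lmFit(v, design)
fit <- eBayes(fit, robust=TRUE)

# coef=2 is the conditionB-versus-conditionA coefficient
tt <- topTable(fit, coef=2) %>% data.frame(Gene=rownames(.), .)
cp <- cpm(y) %>% data.frame(Gene=rownames(.), .)

# Samples 1/2 are "A" (the reference) and 3/4 are "B"; a positive logFC means higher in B.
# Show Gene, logFC and the four per-sample CPM columns for one gene.
merge(x=tt, y=cp, by="Gene")[2,c(1,2,8,9,10,11)]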
