I have performed a differential expression analysis with edgeR using two groups: mutant and control.
When defining the experimental factor, I set control as the reference level, just like the Arabidopsis case study in the edgeR manual:
strain <- factor(substring(colnames(data.set),1,7))
strain <- relevel(strain, ref="control")
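To show what I think these two lines do, here is a minimal base R sketch. The group labels are toy values (an assumption, not my real sample names), and I use model.matrix(~strain) only to inspect how the reference level shapes the design; I am assuming the same kind of design is passed on to edgeR.

```r
# Toy group labels (hypothetical); the real ones come from colnames(data.set).
strain <- factor(c("mutant", "mutant", "mutant",
                   "control", "control", "control"))
strain <- relevel(strain, ref = "control")

# The reference level is listed first.
print(levels(strain))

# With an intercept, the remaining column is named after the
# non-reference level, i.e. it represents mutant relative to control.
design <- model.matrix(~strain)
print(colnames(design))
```

If this is right, the coefficient tested downstream would be the one for the non-reference level (mutant).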
Downstream in the edgeR analysis, I get a list of differentially expressed genes where some of them have a positive logFC value, indicating upregulation.
My assumption is that, since control is the reference, genes with a positive logFC are upregulated in the mutant relative to the control.
This is the impression I get from the R documentation for the relevel() function and from the Arabidopsis case study in the edgeR manual. I have also checked a few genes with a positive logFC against a TMM-normalized gene count matrix, and they have higher expression values in the mutant than in the control.
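The manual check I did looks roughly like the sketch below. The matrix holds made-up toy values (not my data), the names norm.counts and approx.logFC are hypothetical, and the log2 ratio of group means is only a crude sanity check, not the logFC that edgeR estimates.

```r
# Toy normalized expression values (hypothetical, for illustration only).
norm.counts <- matrix(c(10, 12, 11, 40, 44, 48,
                        50, 55, 45, 12, 13, 11),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("geneA", "geneB"),
                                      c("control1", "control2", "control3",
                                        "mutant1", "mutant2", "mutant3")))
strain <- factor(rep(c("control", "mutant"), each = 3))
strain <- relevel(strain, ref = "control")

# Mean expression per gene within each group (columns: control, mutant).
group.means <- sapply(levels(strain), function(g) {
  rowMeans(norm.counts[, strain == g, drop = FALSE])
})

# Rough log2 ratio, mutant over control: positive means higher in mutant.
approx.logFC <- log2(group.means[, "mutant"] / group.means[, "control"])
print(approx.logFC)
```

In this toy example geneA is higher in the mutant and gets a positive value, while geneB is lower in the mutant and gets a negative one.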
However, I want to avoid making one of those catastrophic errors, so I wanted to ask just to be safe.
- Are the genes with a positive logFC in this setup upregulated in the mutant or in the control?
- Is this decided by which level is set as ref in the definition of the experimental factor?
- Had the mutant been set as ref, would it have been the other way around?
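To make the last question concrete, here is a small base R sketch of what I expect to happen. It uses an ordinary linear model (lm) on toy log-expression values as a stand-in; this is an assumption for illustration only and not the negative binomial model that edgeR fits, although I would expect the sign of the coefficient to behave the same way.

```r
# Toy log-expression values for one gene (hypothetical numbers).
log.expr <- c(5.0, 5.2, 4.8, 7.0, 7.1, 6.9)
strain <- factor(rep(c("control", "mutant"), each = 3))

# Same data, two choices of reference level.
strain.ctrl.ref <- relevel(strain, ref = "control")
strain.mut.ref  <- relevel(strain, ref = "mutant")

# The second coefficient is (non-reference mean) minus (reference mean).
coef.ctrl.ref <- unname(coef(lm(log.expr ~ strain.ctrl.ref))[2])
coef.mut.ref  <- unname(coef(lm(log.expr ~ strain.mut.ref))[2])

print(coef.ctrl.ref)  # mutant minus control
print(coef.mut.ref)   # control minus mutant
```

In this sketch the two coefficients have the same magnitude and opposite signs, which is what I would expect the logFC to do if I swapped the reference.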