Soo this question might sound stupid, but I have some trouble understanding how to interpret my logFC values.
Let's assume I have my normalized data from the affy chip (eset), a design matrix and want to identify differentially expressed genes between a Treated and a Control group:
contrastm <- makeContrasts(Treated - Control, levels = design) fit <- lmFit(exprs(eset), design) fit2 <- contrasts.fit(fit, contrastm) fit2b <- eBayes(fit2)
Then I call topTable(fit2b, coef = 1) and get a logFC value of -9.8 for a gene. This means my gene is 2^(-9.8) fold downregulated in the Treated group, right?
So of course if I change the command makeContrasts to Control - Treated I get a logFC of 9.8, which is the value described in the publication I am working with... so which one is right?
Edit: Here is the current problem...:
Sorry for the delay, had a couple other things in between. So I ran the following script at home..It will download the GEOdataset (~115MB), untar it and create a annotation.txt in a new folder:
# libraries needed library(Biobase) library(GEOquery) library(limma) library(simpleaffy) dir.create("BioStars_script") setwd("BioStars_script/") ### This might take a while... geoset <- getGEOSuppFiles("GSE28914") ### untar files in your current workfolder untar("GSE28914/GSE28914_RAW.tar") ### Creates an Annotation.txt for read.affy Annotation_dummy <- read.delim("GSE28914/filelist.txt") ### Sample infos according to the GEO profile, cel files should be ordered in increasing number Annotation <- data.frame(File = Annotation_dummy$Name[-1], Name = Annotation_dummy$Name[-1], Target = c("intact", "acute", "POD3", "intact", "POD3", "intact", "acute", "POD3", "intact", "acute", "POD7", "intact", "acute", "POD7", "intact", "POD3", "POD7", "intact", "acute", "POD3", "POD7", "intact", "acute", "POD3", "POD7")) write.table(Annotation, "Annotation.txt", quote = F, row.names = F, sep = "\t") ### Begin import wound_data <- read.affy(covdesc = "Annotation.txt") ### RMA normalization rma.set <- rma(wound_data) ### Filter probes, remove duplicates, etc e.anig <- nsFilter(rma.set, require.entrez = T, remove.dupEntrez = T) ### Create Model f <- factor(as.factor(rma.set$Target), levels = levels(as.factor(rma.set$Target))) design <- model.matrix(~0 + f) colnames(design) <- levels(as.factor(c("acute", "intact", "POD3", "POD7"))) #### define the comparisons that you would like to do contrast <- makeContrasts(d3 = POD3 - intact, d7 = POD7 - intact, d0 = acute - intact, levels = design) #### fit a linear model for each gene over all microarrays fit <- lmFit(exprs(e.anig$eset), design) ### compute coefficients and standard errors for the defined contrasts fit2 <- contrasts.fit(fit, contrast) ### compute statistics for differential gene expression (moderated t-test, and others) fit2 <- eBayes(fit2) ### Top differentially expressed genes, 204475_at is MMP1 for example topTable(fit2)
Now what really confuses me: I get the correct results as described in the paper! My work machine, running the same script however spits out different results... The only differences were:
- my work machine is blocked from using GEOquery, so I had to download the GEOdataset manually (checked this at home)
- the order of the colnames(design) at home is different than at work, so I adjusted the script accordingly
This is my sessionInfo at home:
R version 3.1.2 (2014-10-31) Platform: x86_64-pc-linux-gnu (64-bit) locale:  LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8  LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C  LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C attached base packages:  parallel stats graphics grDevices utils datasets methods base other attached packages:  hgu133plus2.db_2.14.0 org.Hs.eg.db_2.14.0 RSQLite_1.0.0 DBI_0.3.1 AnnotationDbi_1.26.1  GenomeInfoDb_1.0.2 hgu133plus2cdf_2.14.0 simpleaffy_2.40.0 gcrma_2.36.0 genefilter_1.46.1  affy_1.42.3 limma_3.20.9 GEOquery_2.30.1 Biobase_2.24.0 BiocGenerics_0.10.0 loaded via a namespace (and not attached):  affyio_1.32.0 annotate_1.42.1 BiocInstaller_1.14.3 Biostrings_2.32.1 IRanges_1.22.10  preprocessCore_1.26.1 RCurl_1.95-4.3 splines_3.1.2 stats4_3.1.2 survival_2.37-7  tools_3.1.2 XML_3.98-1.1 xtable_1.7-4 XVector_0.4.0 zlibbioc_1.10.0
At work I also use R 3.1.2 with all packages up to date, but a Windows-32bit machine. (I just checked earlier today and reran the script after updating...) Sessioninfo will follow in the morning.
Anyone got a clue what could be wrong?