Here is a minimal code snippet that basically does what the limma user guide suggests for this kind of array, after downloading the text files for the PC3 cell line from the supplement (at the bottom) of https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122997
I can already tell you that there are no DEGs at all. It is beyond me why people even run microarrays with n=2 per group; that design almost never has the power to reveal anything meaningful. You can additionally run plotMDS to see that the samples do not even cluster particularly well by treatment. Not sure what these data are good for.
library(limma)
files <- list.files("~/Downloads/GSE122997_RAW/", pattern=".txt",
full.names=TRUE)
#/ read files
elistraw <- limma::read.maimages(files, source="agilent", green.only=TRUE)
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490386_PC3_CONTROL_1.txt.gz
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490387_PC3_CONTROL_2.txt.gz
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490388_PC3_GSK126_1.txt.gz
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490389_PC3_GSK126_2.txt.gz
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490390_PC3_NPD13668_1.txt.gz
#> Read /Users/atpoint/Downloads/GSE122997_RAW//GSM3490391_PC3_NPD13668_2.txt.gz
#/ strip the GSM accession prefix and the file extension from the sample names
colnames(elistraw) <- sub("^GSM[0-9]+_", "", sub("\\.txt\\.gz$", "", basename(files)))
#/ normalize
elist <- normalizeBetweenArrays(elistraw, method="quantile")
#/ define targets
elist$targets$FileName # that is the order of the samples
#> [1] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490386_PC3_CONTROL_1.txt.gz"
#> [2] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490387_PC3_CONTROL_2.txt.gz"
#> [3] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490388_PC3_GSK126_1.txt.gz"
#> [4] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490389_PC3_GSK126_2.txt.gz"
#> [5] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490390_PC3_NPD13668_1.txt.gz"
#> [6] "/Users/atpoint/Downloads/GSE122997_RAW//GSM3490391_PC3_NPD13668_2.txt.gz"
elist$targets$group <- rep(c("control", "gsk126", "npd13668"), each=2)
#/ differential
design <- model.matrix(~group, elist$targets)
fit <- lmFit(elist, design)
fit <- eBayes(fit)
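#/ with ~group and control as the reference level, coef=2 is gsk126 vs control and coef=3 is npd13668 vs control
results_control_gsk126 <- topTable(fit, coef=2, number=Inf)
results_control_npd13668 <- topTable(fit, coef=3, number=Inf)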
#/ no differential genes at all even at 25% FDR
summary(results_control_gsk126$adj.P.Val < .25)
#> Mode FALSE
#> logical 62976
summary(results_control_npd13668$adj.P.Val < .25)
#> Mode FALSE
#> logical 62976
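The plotMDS check mentioned above could be sketched roughly as follows. This is only a minimal sketch, assuming the limma package is installed: `plot_mds_by_group` is a hypothetical helper name (not a limma function), and the demo matrix is simulated noise, not the GSE122997 data. With the objects from the snippet above, the call would be `plot_mds_by_group(elist$E, elist$targets$group)`.

```r
library(limma)

# Hypothetical helper (not part of limma): draw an MDS plot with the
# samples labelled and coloured by treatment group.
plot_mds_by_group <- function(E, group) {
  group <- factor(group)
  # plotMDS() draws the plot and returns the MDS object invisibly
  plotMDS(E, labels = as.character(group), col = as.integer(group))
}

# Simulated log-intensities, only here to make the sketch runnable
set.seed(1)
E <- matrix(rnorm(1000 * 6), nrow = 1000, ncol = 6)
group <- rep(c("control", "gsk126", "npd13668"), each = 2)
mds <- plot_mds_by_group(E, group)
```

If the replicates of a treatment do not sit closer to each other than to the other groups in this plot, that is consistent with finding no differential genes.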
Actually, I have several samples of this cell line, and I am using only the untreated/control samples. I wanted to obtain the count file for all the samples and run the RankProd analysis.
Hi, I tried to use the code, but I am getting an error:
files <- list.files("C:/Users/User/OneDrive/Documents/agilent/GSE122997_RAW", pattern=".txt", full.names=TRUE)
elistraw <- read.maimages(files, source="agilent", green.only=TRUE)
The error appears when I run the second line:
Error in file(file, "r") : invalid 'description' argument
I searched for the error and found that it occurs when you try to open multiple files, but I do not understand why it is happening here.
Hi, I solved the error. I had not actually created the folder, so it was my mistake. It works fine now, thanks for the code.
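For anyone hitting the same error: when the folder does not exist, list.files() returns an empty vector without complaining, and that is most likely what makes the later file-opening step fail with the unhelpful 'description' message. A minimal sketch of a check that fails early with a clearer message is below; `list_array_files` is a hypothetical helper name, not part of limma, and the default pattern is an assumption based on the file names in this thread.

```r
# Hypothetical helper (not part of limma): list the array files in a folder
# and stop with a readable message if the folder or the files are missing.
list_array_files <- function(dir, pattern = "\\.txt(\\.gz)?$") {
  if (!dir.exists(dir)) {
    stop("Directory does not exist: ", dir)
  }
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) {
    stop("No files matching '", pattern, "' found in: ", dir)
  }
  files
}

# Demo on a temporary folder so the sketch runs anywhere
demo_dir <- file.path(tempdir(), "demo_arrays")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "sample_1.txt"))
demo_files <- list_array_files(demo_dir)
```

The returned vector can then be passed to read.maimages() exactly as in the original snippet.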