I am following the tutorial given here to perform RMA normalization of dataset GSE1133 using the R code below. The CDF libraries for platforms GPL1073 and GPL1074 have been installed following the description given here.
library(Biobase)
library(GEOquery)
library(affy)
library(limma)
library(hgu133a.db)
library(hgu133acdf)
library(gpl1073cdf)
library(gpl1074cdf)
# Keep only the samples from platform GPL1073
eset <- getGEO("GSE1133", GSEMatrix=TRUE, getGPL=TRUE)
idx <- if (length(eset) > 1) grep("GPL1073", names(eset)) else 1
eset <- eset[[idx]]
phenoData(eset)
files <- sampleNames(eset) # GSM sample IDs on platform GPL1073, used below to select the matching CEL files
setwd("/home/GSE1133")
# Download the supplementary (raw) data
getGEOSuppFiles("GSE1133")
setwd("/home/GSE1133/GSE1133")
untar("GSE1133_RAW.tar", exdir="data")
cels = list.files("data/", pattern="CEL")
pattern <- paste(files, sep="", collapse="|")
cels <- grep(pattern, cels, value=TRUE)
# RMA normalization
raw.data = ReadAffy(verbose=FALSE, filenames=cels, cdfname="gpl1073cdf")
data.rma.norm = rma(raw.data)
However, after running this code I get an error saying that some of the CEL files are not valid. Could someone look into this? I am reading the files that were downloaded with GEOquery, so I am not sure why this error occurs.
Error: the following are not valid files:
GSM18584.CEL.gz
GSM18585.CEL.gz
:
:
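One thing that may be worth checking (an assumption, not a confirmed diagnosis): list.files("data/", pattern="CEL") returns bare file names without the "data/" prefix, while the working directory is one level above "data". The file names in the error message also have no directory prefix. The sketch below uses only base R, with a temporary directory and empty placeholder files (both hypothetical stand-ins for the real download), to show that bare names do not resolve from the parent directory whereas names returned with full.names=TRUE do.

```r
# Minimal sketch with base R only; the temp directory and the empty
# placeholder files are hypothetical stand-ins for the real CEL files.
tmp <- tempfile("cel_demo_")
dir.create(file.path(tmp, "data"), recursive = TRUE)
invisible(file.create(file.path(tmp, "data",
                                c("GSM18584.CEL.gz", "GSM18585.CEL.gz"))))

old_wd <- setwd(tmp)  # working directory is the parent of "data"

# Bare names, as in the code above: looked up relative to the working directory
bare <- list.files("data", pattern = "CEL")
bare_found <- file.exists(bare)

# Names that keep the "data/" prefix
full <- list.files("data", pattern = "CEL", full.names = TRUE)
full_found <- file.exists(full)

setwd(old_wd)  # restore the original working directory

print(bare)
print(bare_found)
print(full)
print(full_found)
```

If this is what is happening with the real data, building cels with full.names=TRUE, or keeping the bare names and passing celfile.path="data" to ReadAffy, might make the files resolvable. I have not verified that this removes the error.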