Question: How can I read CEL files with affy?
F (3.4k, Iran) wrote 2.4 years ago:

Hi,

I am trying to read my CEL files from the [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array, but I get this error:

library(affy)

Data<-ReadAffy()

Error in affyio::read_abatch(filenames, rm.mask, rm.outliers, rm.extra, : Cel file C:/Users/Lenovo/Desktop/GSE50833_RAW/GSE10000_RAW/GSM44660.CEL/GSM252007.CEL does not seem to have the correct dimensions

I also tried read.celfiles() from the oligo package:

celfiles <- list.files("GSE10000/CEL", full.names = TRUE)

rawData<- read.celfiles(celfiles)

All the CEL files must be of the same type. Error: checkChipTypes(filenames, verbose, "affymetrix", TRUE) is not TRUE

How can I read my CEL files for normalization?

R software error • 4.3k views
modified 14 months ago by Biostar • written 2.4 years ago by F

Hi, F

It seems that it is a dimension problem.

written 2.4 years ago by Farbod

ddiez (1.7k, Japan) wrote 2.4 years ago:

The code you show is intriguing because the error message refers to the GSE10000 dataset, which does use Mouse430_2 arrays, but also to GSE50833, which uses Agilent-028005 SurePrint G3 arrays. At any rate, the error suggests that you are trying to read different array types with ReadAffy(), which fails because different arrays have different dimensions. The first thing I would do is make sure that all the files come from the same platform/array.

EDIT

I could replicate the problem and confirm my guess using the following experiment (there must be a better way to do this than reading the whole set of files one by one):

library(affy)
# Read each CEL file on its own, then tabulate the chip annotations.
f <- list.files(pattern = "\\.CEL\\.gz$")
celf <- lapply(f, function(x) ReadAffy(filenames = x))
table(sapply(celf, annotation))
 mouse4302 mouse430a2 
        18         17

The solution is to read them separately.

EDIT 2

OK, this is a much faster way to check the chip type of a bunch of CEL files, because it reads only the header of each file:

library(affyio)
f <- list.files(pattern = "\\.CEL\\.gz$")
# cdfName in the CEL header identifies the chip type.
table(sapply(f, function(x) read.celfile.header(x)$cdfName))
 Mouse430_2 Mouse430A_2 
         18          17

EDIT 3

And this is how you can use the information above to read the files in different batches:

ff <- split(f, sapply(f, function(x) read.celfile.header(x)$cdfName))
ff
$Mouse430_2
 [1] "GSM250879.CEL.gz" "GSM250880.CEL.gz" "GSM250881.CEL.gz" "GSM250882.CEL.gz" "GSM250919.CEL.gz" "GSM250920.CEL.gz"
 [7] "GSM250922.CEL.gz" "GSM250923.CEL.gz" "GSM250925.CEL.gz" "GSM250927.CEL.gz" "GSM250928.CEL.gz" "GSM250943.CEL.gz"
[13] "GSM44658.CEL.gz"  "GSM44659.CEL.gz"  "GSM44660.CEL.gz"  "GSM44661.CEL.gz"  "GSM44662.CEL.gz"  "GSM44663.CEL.gz" 

$Mouse430A_2
 [1] "GSM252007.CEL.gz" "GSM252008.CEL.gz" "GSM252009.CEL.gz" "GSM252010.CEL.gz" "GSM252011.CEL.gz" "GSM252014.CEL.gz"
 [7] "GSM252015.CEL.gz" "GSM252016.CEL.gz" "GSM252017.CEL.gz" "GSM252018.CEL.gz" "GSM252021.CEL.gz" "GSM252022.CEL.gz"
[13] "GSM252033.CEL.gz" "GSM252040.CEL.gz" "GSM252051.CEL.gz" "GSM252052.CEL.gz" "GSM252053.CEL.gz"

library(affy)
abatch1 <- ReadAffy(filenames = ff$Mouse430_2)
abatch2 <- ReadAffy(filenames = ff$Mouse430A_2)

And so on.
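From here, normalization could be run once per chip type rather than on all files together. The following is only a sketch: rma_by_chip is a hypothetical helper name, not part of affy, and it assumes the CDF annotation packages matching both chip types are installed so that rma() can find the probe layouts.

```r
# Hypothetical helper: read and RMA-normalize CEL files, one batch per chip type.
rma_by_chip <- function(files) {
  # Chip type of each file, taken from the CEL header only.
  chip <- vapply(files, function(x) affyio::read.celfile.header(x)$cdfName,
                 character(1))
  # One vector of file names per chip type.
  batches <- split(files, chip)
  # One AffyBatch, and hence one ExpressionSet, per chip type.
  lapply(batches, function(b) affy::rma(affy::ReadAffy(filenames = b)))
}

# Gzipped CEL files in the working directory (empty result if there are none).
f <- list.files(pattern = "\\.CEL\\.gz$")
esets <- rma_by_chip(f)
```

With the files listed above, esets$Mouse430_2 and esets$Mouse430A_2 would hold the two normalized sets, and exprs() on each would give its probeset-by-sample matrix.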

modified 2.4 years ago • written 2.4 years ago by ddiez

I think "GSE50833_RAW/GSE10000_RAW/" is just poor folder naming.

written 2.4 years ago by Farbod

Absolutely. Better to keep each dataset in its own folder.

modified 2.4 years ago • written 2.4 years ago by ddiez

Thank you, but GSE50833_RAW is the name of the folder in which GSE10000 is located :(

written 2.4 years ago by F

I see. But nothing prevents you from moving the folder to its own location, right? I am not trying to impose my own logic about file organization (mainly because it is often far from perfect or rational) but, in this particular case, I would keep the datasets in different folders.
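If it helps, files can be sorted into subfolders from within R. This is a minimal sketch, assuming one label per file (a dataset accession or a chip type). sort_into_folders is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary directory rather than real CEL files.

```r
# Hypothetical helper: move each file into a subfolder named after its group.
sort_into_folders <- function(files, group, root = dirname(files[1])) {
  targets <- file.path(root, group, basename(files))
  # Create one subfolder per distinct group label.
  for (d in unique(dirname(targets))) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  ok <- file.rename(files, targets)
  if (!all(ok)) warning("some files could not be moved")
  invisible(targets)
}

# Demonstration on empty placeholder files in a temporary directory.
demo_dir <- file.path(tempdir(), "cel_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- file.path(demo_dir, c("GSM44658.CEL.gz", "GSM252007.CEL.gz"))
file.create(demo_files)
moved <- sort_into_folders(demo_files, group = c("Mouse430_2", "Mouse430A_2"))
```

The group labels could come from whatever distinguishes the files, for example the cdfName read from each CEL header.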

written 2.4 years ago by ddiez

thank you

f <- list.files(pattern = "CEL.gz")

celf <- lapply(f, function(x) ReadAffy(filenames = x))

table(sapply(cdfs, annotation))

Error in sapply(cdfs, annotation) : object 'cdfs' not found

eset<-rma(celf)

Error in (function (classes, fdef, mtable) :

unable to find an inherited method for function ‘rma’ for signature ‘"list"’

Data<-ReadAffy()

Error in affyio::read_abatch(filenames, rm.mask, rm.outliers, rm.extra, :

Cel file C:/Users/Lenovo/Desktop/GSE10000_RAW/GSM252007.CEL.gz does not seem to have the correct dimensions

written 2.4 years ago by F

Sorry, I changed a variable name at the last moment without checking that the code still worked, which is always a bad idea. Use celf instead of cdfs. Anyway, I added a better way to do the same thing without having to read all the files, which is much more efficient if you have lots of files.

modified 2.4 years ago • written 2.4 years ago by ddiez

Thank you,

your second edit ran without error:

library(affyio)

f <- list.files(pattern = "CEL.gz")

table(sapply(f, function(x) read.celfile.header(x)$cdfName))

 Mouse430_2 Mouse430A_2 
         18          17

Sorry, how can I carry on with normalization from here?

I want to run

Data<-ReadAffy()

eset<-rma(Data)

but I can't figure out how to connect the table output to ReadAffy().

modified 2.4 years ago • written 2.4 years ago by F

Well, it is difficult to answer. I would use the information above about the different platforms to read the two sets of files separately (i.e. two calls to ReadAffy) and process them independently (two calls to rma). Then you will have to figure out how to combine them. Maybe extract the expression matrices and put them together, with missing values for the non-matching probesets? It is possible, but I am not sure whether it is a good idea. In any case, the first thing I would do is find out how the original authors did it. The dataset (GSE10000) has been published, and the paper may say something about the two different platforms.
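To make the "matrices with missing values" idea concrete, here is a minimal sketch in base R. The toy matrices stand in for exprs() of the two normalized sets. combine_with_na and the probeset and sample names are made up for illustration, and sample names are assumed to be unique across the two sets. It only shows the mechanics and says nothing about whether values from the two platforms are comparable.

```r
# Toy stand-ins for the two expression matrices: probesets in rows, samples in columns.
m1 <- matrix(c(7.1, 8.2, 9.3, 7.4, 8.5, 9.6), nrow = 3,
             dimnames = list(c("ps_a", "ps_b", "ps_c"), c("s1", "s2")))
m2 <- matrix(c(6.9, 8.0), nrow = 2,
             dimnames = list(c("ps_a", "ps_b"), "s3"))

# Hypothetical helper: union of probesets, NA where a platform lacks a probeset.
combine_with_na <- function(x, y) {
  ids <- union(rownames(x), rownames(y))
  out <- matrix(NA_real_, nrow = length(ids), ncol = ncol(x) + ncol(y),
                dimnames = list(ids, c(colnames(x), colnames(y))))
  out[rownames(x), colnames(x)] <- x
  out[rownames(y), colnames(y)] <- y
  out
}

combined <- combine_with_na(m1, m2)

# Alternative: keep only the probesets measured on both platforms.
common <- combined[complete.cases(combined), , drop = FALSE]
```

Here combined keeps all three toy probesets with an NA for ps_c in sample s3, while common drops ps_c entirely. Which of the two is more appropriate depends on the downstream analysis.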

written 2.4 years ago by ddiez

Added some extra help in my answer regarding how to use the information about the platform to read the files.

written 2.4 years ago by ddiez

Thank you. I had been confused since yesterday, and you finally clarified the source of the error.

written 2.4 years ago by F