How to RMA normalize microarray expression data without the CEL files?
16 months ago
Microuser • 0


I used the getGEO function to download several related microarray datasets, extracted the expression values from each, and merged them. I now have a matrix of expression values with GEO accessions as columns and probe sets as rows. I want to do RMA normalisation with the oligo package, but I cannot, because not all CEL files are available for creating an AffyBatch object. Is there any other way to do RMA normalisation?

Thank you in advance.

microarray oligo affy rma • 955 views

It is highly likely that the data retrieved via getGEO() is already normalised. Please check the 'Data processing' part of a sample's record on GEO itself, where it should indicate this information.

All CEL files should nevertheless be available for download, should you want to start from that stage. Are you saying that not all of them are available?

To help, please share the commands that you may have already used.


Thanks Kevin. Yes, not all of the CEL files are available: they are neither on the GEO site nor retrievable with the getGEOSuppFiles function. I have checked the expression data and they are not normalised, so I would like to normalise all the datasets together.

This is what I would like to do. merged_file is a combination of several files, each generated with the following commands:

library(GEOquery)
gset <- getGEO('GSE', GSEMatrix = TRUE)[[1]]  # getGEO returns a list of ExpressionSets; take the first
file_1 <- exprs(gset)

For example, I merged file_1, file_2, etc. to create merged_file.
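The merge step is not shown in the post. A minimal base R sketch of one way it could be done follows, assuming every matrix has probe set IDs as row names and that only probes shared by all series are kept. The helper name merge_by_probe and the toy matrices are hypothetical and stand in for real exprs() output.

```r
# Hypothetical helper: column-bind expression matrices on their shared probe IDs.
merge_by_probe <- function(mats) {
  # Probe IDs that are present in every matrix
  common <- Reduce(intersect, lapply(mats, rownames))
  # Subset each matrix to the shared probes (same row order), then bind the samples
  do.call(cbind, lapply(mats, function(m) m[common, , drop = FALSE]))
}

# Toy data standing in for exprs() output from two series
file_1 <- matrix(1:6, nrow = 3,
                 dimnames = list(c("p1", "p2", "p3"), c("GSM1", "GSM2")))
file_2 <- matrix(7:10, nrow = 2,
                 dimnames = list(c("p3", "p2"), c("GSM3", "GSM4")))

merged_file <- merge_by_probe(list(file_1, file_2))
```

Matching on row names rather than on row position avoids silently misaligning probes when the series list them in different orders.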

library(oligo)
eSet <- rma(merged_file)
data <- exprs(eSet)

But I get this error: Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘rma’ for signature ‘"matrix"’

I think the problem is that I do not have an AffyBatch object, as I usually start with:

celfiles <- list.celfiles()
affyobject <- read.celfiles(celfiles)

and then I use the above functions to normalise.


If you are absolutely sure that the data you have retrieved are just the raw signal intensities with no other processing, then that is extremely rare for data on GEO, and it does mean that RMA has to be applied. Have you confirmed 100% that the data are indeed the raw signal data? Unfortunately, you have not provided the GSE number, so I cannot check on my side.
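One rough way to sanity-check the scale of a downloaded matrix is sketched below in base R. It is only a rule of thumb, not a substitute for reading the 'Data processing' field on GEO: the thresholds are assumptions, and the function name describe_scale and the simulated data are hypothetical.

```r
# Hypothetical helper: guess whether values are on a linear (un-logged) scale
# and report how far apart the per-sample medians are.
describe_scale <- function(mat) {
  qx <- quantile(mat, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE, names = FALSE)
  # Assumed thresholds: log2 intensities rarely exceed ~20, so very large
  # values or a very wide range suggest the data were never log-transformed.
  looks_linear <- (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)
  # Widely differing sample medians hint at missing between-array normalisation.
  sample_medians <- apply(mat, 2, median, na.rm = TRUE)
  list(looks_linear = looks_linear,
       median_range = diff(range(sample_medians)))
}

# Simulated data for illustration only
set.seed(1)
raw_like <- matrix(2^rnorm(200, mean = 8, sd = 2), ncol = 4)
log_like <- log2(raw_like)

raw_check <- describe_scale(raw_like)
log_check <- describe_scale(log_like)
```

A boxplot of the columns (boxplot(mat)) gives the same information visually: aligned boxes suggest the samples were already normalised together.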

Sometimes, authors provide the background-corrected and quantile normalised data, which then just requires a log [base 2] transformation.
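For that case, and for bringing several merged series onto a common distribution, a minimal base R sketch is given below. It is not RMA: it only applies a log2 transformation followed by quantile normalisation at the probe set level. The function name quantile_normalise and the toy values are hypothetical, and the sketch assumes a matrix with no missing values and breaks ties by order of appearance.

```r
# Hypothetical helper: quantile-normalise the columns (samples) of a matrix.
# Assumes no NA values; ties are broken by order of appearance.
quantile_normalise <- function(mat) {
  # Rank of each value within its own sample
  ranks <- apply(mat, 2, rank, ties.method = "first")
  # Each sample sorted in increasing order
  sorted <- apply(mat, 2, sort)
  # Reference distribution: mean across samples at each rank
  ref <- rowMeans(sorted)
  # Replace every value by the reference value at its rank
  out <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(mat)
  out
}

# Toy linear-scale intensities standing in for a merged matrix
linear_expr <- matrix(c(4, 16, 64, 8, 128, 32), nrow = 3,
                      dimnames = list(c("p1", "p2", "p3"), c("GSM1", "GSM2")))

log_expr   <- log2(linear_expr)            # log [base 2] transformation
normalised <- quantile_normalise(log_expr) # force identical distributions
```

After this step every sample has the same set of values, only assigned to different probes. Quantile normalisation cannot remove batch effects between series, so those would still need separate handling.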

Performing RMA manually on an expression matrix is advanced and requires you to construct a GeneFeatureSet or similar object from scratch.

