Question: How to process RMA data from a gene microarray?
Laura_zz wrote:

I got RMA data from ArrayExpress; it is single-channel data. I don't know much about gene chips, so I read up on them and tried to run the script provided with the limma package myself, but I don't know whether it is right.

  1. One step in processing RMA data is to run exprs() on it, but I can't run that, so I used log2(RMA data + 1) in place of exprs(). Is that correct?
  2. Should RMA data first be transformed with log2(RMA data + 1) and then passed to lmFit() in the limma package to compute differential expression?
Tags: rna-seq, R
written 6 days ago by Laura_zz • modified 4 days ago by Kevin Blighe

Your data is likely already normalised and log2-transformed. Can you please show all data processing commands that you have used?

exprs() has nothing to do with the RMA process itself. exprs() is just a function that accesses a 'slot' (variable) in an ExpressionSet R object - this 'slot' contains the expression data.
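As a minimal sketch of what exprs() does, assuming the Biobase package is installed (the matrix values, probe names and sample names below are made up for illustration):

```r
library(Biobase)

# Toy expression matrix: rows = probes, columns = samples (made-up values)
mat <- matrix(c( 7.1,  8.3, 6.9,
                10.2, 10.5, 9.8),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("probeA", "probeB"),
                              c("sample1", "sample2", "sample3")))

# Wrap the matrix in an ExpressionSet object
eset <- ExpressionSet(assayData = mat)

# exprs() only retrieves the stored matrix; it does not transform the values
ex <- exprs(eset)
dim(ex)

# The replacement form stores a new matrix in the same slot
exprs(eset) <- ex + 1
```

A plain matrix or data frame read from a CSV file is not an ExpressionSet, which would explain why exprs() cannot be run on it directly.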

RMA normalisation involves (in this order)

  1. background correction
  2. quantile normalisation
  3. log2 transformation
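To make steps 2 and 3 concrete, here is a rough base-R sketch on a made-up matrix. It is only an illustration, not the RMA implementation: step 1 is skipped (the toy values are assumed to be background-corrected already, since RMA's background model is more involved), and `quantile_normalise` is a hypothetical helper written for this example. In practice one would rely on an existing implementation such as those in the affy or oligo packages.

```r
# Toy matrix of (assumed) background-corrected intensities, made-up values:
# rows = probes, columns = arrays
x <- matrix(c(120, 150, 100,
              340, 300, 420,
               90, 110,  80,
              800, 950, 700),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))

# Hypothetical helper: force every array to share the same distribution
quantile_normalise <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each array
  sorted <- apply(x, 2, sort)                         # sort each array
  ref    <- unname(rowMeans(sorted))                  # mean of each quantile
  out    <- apply(ranks, 2, function(r) ref[r])       # map ranks to the means
  dimnames(out) <- dimnames(x)
  out
}

x_qn   <- quantile_normalise(x)  # step 2: quantile normalisation
x_log2 <- log2(x_qn)             # step 3: log2 transformation
```

After step 2 every column holds the same set of values, only in a different order, which is why boxplots of quantile-normalised arrays line up.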
written 6 days ago by Kevin Blighe • modified 6 days ago

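library(limma)

# Expression values: column 1 is used as row names, columns 2-22 are the samples
A <- read.csv(file = "RMAdata.csv", header = TRUE, as.is = TRUE)
A2 <- as.matrix(A[, 2:22])
rownames(A2) <- A[, 1]

# log2-transform (the +1 offset avoids log2(0))
A3 <- log2(A2 + 1)

# Design matrix: column 1 is used as row names, columns 2-11 are the groups
kk <- read.csv(file = "design.csv", header = TRUE, as.is = TRUE)
kk1 <- as.matrix(kk[, 2:11])
rownames(kk1) <- kk[, 1]
mode(kk1) <- "numeric"

# Fit the linear model and test the contrast LeafD0.5h vs LeafD0h
fit <- lmFit(A3, kk1)
cont.matrix <- makeContrasts(LeafD0.5h - LeafD0h, levels = kk1)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)

# Export the results with Benjamini-Hochberg adjusted p-values
LeafD0.5hVSLeafD0h <- topTable(fit2, adjust.method = "BH", number = 30000)
write.csv(LeafD0.5hVSLeafD0h, file = "LeafD0.5hVSLeafD0h.csv")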

In these commands, I use log2() to replace exprs(). The RMA data came from ArrayExpress, and if I use it directly, the fold changes in differential expression come out in the thousands to tens of thousands, so I'm not sure whether the data can be used directly for differential expression analysis.

The protocol description of this experiment states that the raw data (.probe file) was subjected to RMA (Robust Multi-Array Analysis; Irizarry et al. Biostatistics 4(2):249), quantile normalization (Bolstad et al. Bioinformatics 19(2):185), and background correction as implemented in the NimbleScan software package, version 2.4.27 (Roche NimbleGen, Inc.).

Experiment description link: https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-38130/?keywords=&organism=Oryza+sativa&exptype%5B%5D=%22rna+assay%22&exptype%5B%5D=&array=&page=1&pagesize=500&tdsourcetag=s_pctim_aiomsg

written 5 days ago by Laura_zz • modified 5 days ago

"In these commands, I use log2() to replace exprs()."

Hey, I am not sure what you mean by this ^

log2() and exprs() do different things - one cannot replace the other. Here is what the description in the manual page of exprs() says:

Description:
     These generic functions access the expression and error
     measurements of assay data stored in an object derived from the
     ‘eSet-class’.

It would be easier to use GEO2R to obtain this data - the EBI has no great automated way to obtain published expression datasets - NCBI's GEO does.

  1. Go to main accession page (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38130)
  2. Click on the blue button ANALYZE WITH GEO2R
  3. Click on the R script tab

There, you will find code to automatically obtain the data:

################################################################
#   Boxplot for selected GEO samples
library(Biobase)
library(GEOquery)

# load series and platform data from GEO

gset <- getGEO("GSE38130", GSEMatrix =TRUE, getGPL=FALSE)
if (length(gset) > 1) idx <- grep("GPL15594", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

It seems that, for this project, the data is not log transformed, so, you need to log-transform it like this:

exprs(gset) <- log2(exprs(gset))

# title for the plot
title <- paste0("GSE38130", "/", annotation(gset), " selected samples")
boxplot(exprs(gset), boxwex=0.7, notch=TRUE, main=title, outline=FALSE, las=2)

[Image: boxplot of the log2-transformed expression values per sample]

hist(exprs(gset))

[Image: histogram of the log2-transformed expression values]
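As a numeric complement to the plots, here is a rough heuristic for judging whether a matrix is still on the linear scale, modelled on the quantile check that GEO2R-generated scripts typically include. The function name `looks_unlogged` and the simulated data are assumptions made for this sketch, and the thresholds are rules of thumb, not guarantees.

```r
# Heuristic: TRUE if the values look like untransformed (linear) intensities
looks_unlogged <- function(ex) {
  qx <- as.numeric(quantile(ex, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
  (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)
}

# Simulated data (made up): intensities on the linear scale and their log2
set.seed(1)
linear_scale <- matrix(2^rnorm(200, mean = 10, sd = 2), nrow = 50)
log_scale    <- log2(linear_scale)

looks_unlogged(linear_scale)
looks_unlogged(log_scale)

# Why it matters: a difference of group means is in raw intensity units on
# the linear scale, but it is a log2 fold change on the log2 scale
a <- c(1000, 1200)   # made-up intensities, group 1
b <- c(4000, 4400)   # made-up intensities, group 2
diff_linear <- mean(b) - mean(a)
diff_log2   <- mean(log2(b)) - mean(log2(a))
```

limma reports differences between coefficients on whatever scale it is given, so feeding it unlogged intensities produces "fold changes" in the thousands, as with `diff_linear` here.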

Technically speaking, the authors' description of their data processing steps is incorrect. RMA normalisation IS background correction, quantile normalisation, and log [base 2] transformation. So, technically, they have not performed RMA (they only performed 2 steps of it).

written 4 days ago by Kevin Blighe • modified 4 days ago