Tutorial: Sharing some simple code for microarray normalization in R, for those who are as new to R as I am

8.2 years ago

zizigolu ★ 4.3k

Hi,

This is certainly not perfect, but I hope it helps.

I am working with Arabidopsis thaliana

For RMA normalization

library(affy)

# To read all CEL files in the working directory:
Data <- ReadAffy()

eset <- rma(Data)
norm.data <- exprs(eset)

# norm.data now holds the normalized expression for every probeset on the ATH1 microarrays used in this example.
# To convert the probeset IDs to Arabidopsis gene identifiers (AGI), download the file
# ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2010-12-20.txt
# from the TAIR database and place it in the folder with the microarray data.
# To avoid ambiguous probeset associations (i.e. probesets that match more than one gene), only probesets that match exactly one gene in the Arabidopsis genome are used.
affy_names <- read.delim("affy_ATH1_array_elements-2010-12-20.txt", header=TRUE)

# Select the columns that contain the probeset ID and corresponding AGI number. Please note that the positions used to index the matrix depend on the input format of the array elements file. You can change these numbers to index the corresponding columns if you are using a different format:
probe_agi <- as.matrix(affy_names[,c(1,5)])

# To associate the probeset with the corresponding AGI locus:
normalized.names <- merge(probe_agi, norm.data, by.x=1, by.y=0)[,-1]

# To remove probesets that do not match the Arabidopsis genome:
normalized.arabidopsis <- normalized.names[grep("AT",normalized.names[,1]),]

# To remove ambiguous probesets (this assumes that multiple loci for one probeset are separated by ";" in the TAIR file):
normalized.arabidopsis.unambiguous <- normalized.arabidopsis[grep(pattern=";", normalized.arabidopsis[,1], invert=TRUE),]

# In some cases, multiple probes match the same gene, due to updates in the annotation of the genome. To remove duplicated genes in the matrix:
normalized.agi.final <- normalized.arabidopsis.unambiguous[!duplicated(normalized.arabidopsis.unambiguous[,1]),]

# To assign the AGI number as row name:
rownames(normalized.agi.final) <- normalized.agi.final[,1]
normalized.agi.final <- normalized.agi.final[,-1]

# The resulting gene expression dataset has unique row identifiers (i.e. AGI loci), with the expression values from each experiment in a separate column.

# To export this data matrix from R to a tab-delimited file, use the following command. The file will be written to your current working directory (check it with getwd() or change it with setwd()):
write.table(normalized.agi.final, "RMA.txt", sep="\t", col.names=NA, quote=FALSE)
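If you want to check that the exported file can be loaded again, here is a minimal sketch of the round trip. It uses a made-up toy matrix and a temporary file in place of normalized.agi.final and "RMA.txt", so those names and numbers are only assumptions for illustration.

```r
# Toy stand-in for normalized.agi.final (the values are made up)
toy <- matrix(c(7.1, 8.3, 5.2, 6.4, 9.0, 4.8), nrow = 3,
              dimnames = list(c("AT1G01010", "AT1G01020", "AT1G01030"),
                              c("sample1.CEL", "sample2.CEL")))

# Temporary file used here instead of "RMA.txt"
out_file <- tempfile(fileext = ".txt")

# col.names = NA writes an empty header cell above the row names,
# so the header line has as many fields as the data lines
write.table(toy, out_file, sep = "\t", col.names = NA, quote = FALSE)

# row.names = 1 turns the first column (the AGI loci) back into row names;
# check.names = FALSE keeps the sample names unchanged
reloaded <- as.matrix(read.delim(out_file, row.names = 1, check.names = FALSE))

print(reloaded)
```

Without col.names=NA the header would be one field shorter than the data rows, which confuses some spreadsheet programs.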

For VSN and gcrma normalization, only the part below changes; the rest is the same as above.

For VSN normalization

library(affy)
library(vsn)

Data <- ReadAffy()

eset <- expresso(Data, normalize.method="vsn", bg.correct=FALSE, pmcorrect.method="pmonly", summary.method="medianpolish")

norm.data <- exprs(eset)

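For gcrma normalization

library(affy)
library(gcrma)

# Read the CEL files first if Data is not already loaded in the session:
Data <- ReadAffy()

eset <- gcrma(Data)
norm.data <- exprs(eset)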
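Since only the normalization call differs between RMA, VSN and gcrma, the shared annotation steps could be wrapped in one function and reused for each method. This is only a sketch on toy data: the function name annotate_by_agi, the probeset IDs and the values are all made up, and it assumes that ambiguous probesets list their loci separated by ";".

```r
# Hypothetical helper: map a normalized matrix (row names = probeset IDs)
# to unique AGI loci, following the same steps as in the RMA example
annotate_by_agi <- function(norm.data, probe_agi) {
  # Join on probeset ID: column 1 of probe_agi, row names of norm.data
  merged <- merge(probe_agi, norm.data, by.x = 1, by.y = 0)[, -1]
  agi <- as.character(merged[, 1])
  # Keep rows with an AGI locus and drop ambiguous ones (assumed ";" separator)
  keep <- grepl("AT", agi, fixed = TRUE) & !grepl(";", agi, fixed = TRUE)
  merged <- merged[keep, , drop = FALSE]
  # Keep the first probeset for each gene
  merged <- merged[!duplicated(merged[, 1]), , drop = FALSE]
  out <- as.matrix(merged[, -1, drop = FALSE])
  rownames(out) <- as.character(merged[, 1])
  out
}

# Toy probeset-to-locus table (made-up IDs)
probe_agi <- cbind(array_element_name = paste0("p", 1:5, "_at"),
                   locus = c("AT1G01010", "AT1G01010",
                             "AT1G01020;AT1G01030", "no_match", "AT2G01000"))

# Toy normalized matrix (made-up values)
norm.data <- matrix(as.numeric(1:10), nrow = 5,
                    dimnames = list(paste0("p", 1:5, "_at"), c("s1", "s2")))

result <- annotate_by_agi(norm.data, probe_agi)
print(result)
```

With real data you would call the same function on the norm.data produced by rma(), expresso() or gcrma(), together with the probe_agi matrix built from the TAIR file.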

For the Illumina HumanHT-12 V3.0 expression BeadChip

library(affy)
library(limma)
library(GEOquery)

# Download the GEO series
G <- getGEO("GSE3053", GSEMatrix=TRUE)

#Get the ExpressionSet object
eset <- G[[1]]

#Normalization: may not be necessary as GEO datasets should be pre-processed. 
#library(affyPLM)
#eset.n <- normalize.ExpressionSet.quantiles(eset)

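# To check whether the dataset appears to be normalized, extract the expression matrix and inspect it:
e <- exprs(eset)
# Export the matrix for inspection ("expression_values.txt" is a placeholder file name):
write.table(e, "expression_values.txt", sep="\t", col.names=NA, quote=FALSE)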
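One rough way to judge whether an expression matrix already looks normalized is to compare the distribution of each sample. The sketch below uses simulated values in place of e <- exprs(eset), and the cutoff of 100 for "looks log-transformed" is only a rule of thumb I am assuming, not an official threshold.

```r
# Simulated stand-in for e <- exprs(eset); the numbers are made up
set.seed(1)
e <- cbind(sampleA = rnorm(1000, mean = 8.0, sd = 1.5),
           sampleB = rnorm(1000, mean = 8.2, sd = 1.5))

# Per-sample quantiles: normalized samples usually have similar distributions
qs <- apply(e, 2, quantile, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
print(round(qs, 2))

# Heuristic (assumption): values above about 100 suggest raw, non-log intensities
looks_log2 <- max(e, na.rm = TRUE) < 100

# Difference between the largest and the smallest sample median
median_spread <- diff(range(qs["50%", ]))

# Side-by-side boxplots make the same comparison visual
if (interactive()) boxplot(e, las = 2, ylab = "expression")
```

If the medians and quartiles differ a lot between samples, or the values run into the thousands, the data probably still need log transformation or normalization.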

And finally a link for Agilent microarray normalization

http://matticklab.com/index.php?title=Single_channel_analysis_of_Agilent_microarray_data_with_Limma

Updated 21 months ago by Ram 43k • written 8.2 years ago by zizigolu ★ 4.3k

Also useful for beginners like me. Thanks for sharing.

7.7 years ago by JackieMe ▴ 30

Hello, thank you for sharing! But you should post this as a tutorial, not a question, and you could format your code like this to make it easier to read:

library (affy)

7.7 years ago by Carlo Yague 8.6k

Thank you, friends. I actually remember what a hard time I had when I was learning normalization :)

7.7 years ago by zizigolu ★ 4.3k

@F: I reformatted your post to make it more readable. Take a look and confirm things look ok. Post type was also changed to tutorial to correctly reflect the content.

7.1 years ago by GenoMax 141k

Thank you so much, it really looks much more readable. Today I was googling for Illumina HumanHT-12 V3.0 expression BeadChip data normalization, and I thought I would share what I learned for students who are beginners in R.

7.1 years ago by zizigolu ★ 4.3k

I made some further modifications: some assignment arrows were converted to <- and the quotation marks were changed too.

7.0 years ago by WouterDeCoster 47k