Tutorial: sharing some simple code for microarray normalization in R, for those who are as new to R as I am
5.9 years ago
A ★ 4.0k

Hi,

This is certainly not perfect, but I hope it helps.

I am working with Arabidopsis thaliana.

For RMA normalization:

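library(affy)

# To read all CEL files in the working directory (set it first with setwd() to the folder that holds the CEL files):
Data <- ReadAffy()

# RMA returns background-corrected, normalized, log2-scale expression values:
eset <- rma(Data)
norm.data <- exprs(eset)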

# The norm.data R object contains the normalized expression for every probeset in the ATH1 microarrays used in this example. To convert the probeset IDs to Arabidopsis gene identifiers, download the file ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2010-12-20.txt from the TAIR database and place it in the folder with the microarray data. To avoid ambiguous probeset associations (i.e. probesets that match more than one gene), only probesets that match a single gene in the Arabidopsis genome are used.

# Read the array elements file (assuming it is tab-delimited with a header row):
affy_names <- read.delim("affy_ATH1_array_elements-2010-12-20.txt", header=T)

# Select the columns that contain the probeset ID and the corresponding AGI number. Please note that the positions used to index the table depend on the format of the array elements file. Change these numbers if you are using a file with a different layout:
probe_agi <- as.matrix(affy_names[,c(1,5)])

# To associate the probeset with the corresponding AGI locus:
normalized.names <-merge(probe_agi,norm.data,by.x=1,by.y=0)[,-1]

# To remove probesets that do not match the Arabidopsis genome:
normalized.arabidopsis <- normalized.names[grep("AT",normalized.names[,1]),]

# To remove ambiguous probes (probesets that list several loci, assumed here to be separated by ";"):
normalized.arabidopsis.unambiguous <- normalized.arabidopsis[grep(pattern=";",normalized.arabidopsis[,1], invert=T),]

# In some cases, multiple probes match the same gene, due to updates in the annotation of the genome. To remove duplicated genes in the matrix:
normalized.agi.final <- normalized.arabidopsis.unambiguous[!duplicated(normalized.arabidopsis.unambiguous[,1]),]

# To assign the AGI number as row name:
rownames(normalized.agi.final) <- normalized.agi.final[,1]
normalized.agi.final <- normalized.agi.final[,-1]

# The resulting gene expression dataset has unique row identifiers (i.e. AGI loci), and each column holds the expression values from one experiment.
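To see what the annotation steps above do without loading real arrays, here is a minimal sketch on a made-up matrix. The probeset IDs, loci and expression values are invented for illustration, and the ";" separator for probesets that match several genes is an assumption about the TAIR file format.

```r
# Toy stand-ins for norm.data and the probeset-to-AGI table (all values invented)
norm.data <- matrix(c(7.1, 8.2, 5.0, 6.3, 9.9,
                      7.4, 8.0, 5.2, 6.1, 9.7),
                    ncol = 2,
                    dimnames = list(c("p1", "p2", "p3", "p4", "p5"),
                                    c("sample1.CEL", "sample2.CEL")))
probe_agi <- cbind(array_element_name = c("p1", "p2", "p3", "p4", "p5"),
                   locus = c("AT1G01010", "AT1G01020;AT1G01030",
                             "unassigned", "AT1G01010", "AT2G01008"))

# Join on probeset ID (column 1 of x, row names of y), then drop the ID column
annotated <- merge(probe_agi, norm.data, by.x = 1, by.y = 0)[, -1]

# Keep rows with an AGI locus, drop multi-gene probesets, keep the first probe per gene
annotated <- annotated[grep("AT", annotated[, 1]), ]
annotated <- annotated[grep(";", annotated[, 1], invert = TRUE), ]
annotated <- annotated[!duplicated(annotated[, 1]), ]

# Use the AGI locus as row name and drop the locus column
rownames(annotated) <- annotated[, 1]
annotated <- annotated[, -1]
print(annotated)
```

Only two genes survive in this toy case: p2 is dropped as ambiguous, p3 has no locus, and p4 duplicates the gene already covered by p1.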

# To export this data matrix from R to a tab-delimited file, use the following command. The file will be written to your current R working directory (the folder you set with setwd(); check it with getwd()):
write.table(normalized.agi.final,"RMA.txt", sep="\t",col.names=NA,quote=F)
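As a quick sanity check, the exported file can be read back into R. This is a minimal sketch with an invented two-gene table written to a temporary file rather than the real RMA.txt. With col.names=NA the header starts with an empty field, so row.names=1 should restore the AGI loci as row names.

```r
# Invented stand-in for normalized.agi.final
toy <- data.frame(sample1 = c(7.1, 9.9), sample2 = c(7.4, 9.7),
                  row.names = c("AT1G01010", "AT2G01008"))

# Write to a temporary file with the same options as the export command
out_file <- tempfile(fileext = ".txt")
write.table(toy, out_file, sep = "\t", col.names = NA, quote = FALSE)

# Read it back; the first column becomes the row names again
reloaded <- read.delim(out_file, row.names = 1, check.names = FALSE)
print(reloaded)
```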


For VSN and gcrma normalization, only the part below changes; the rest of the script is the same as for RMA.

VSN normalization:

library(affy)

library(vsn)

eset <- expresso(Data, normalize.method="vsn", bg.correct=F, pmcorrect.method="pmonly", summary.method="medianpolish")

norm.data <- exprs(eset)


For gcrma normalization:

library(affy)

library(gcrma)

eset <- gcrma(Data)
norm.data <- exprs(eset)

For the Illumina HumanHT-12 V3.0 expression BeadChip:

library(affy)

library(limma)

library(GEOquery)

# Download the GEO series

G <- getGEO("GSE3053",GSEMatrix=T)

#Get the ExpressionSet object

eset <- G[[1]]

#Normalization: may not be necessary as GEO datasets should be pre-processed.

#library(affyPLM)

#eset.n <- normalize.ExpressionSet.quantiles(eset)
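The commented-out call above applies quantile normalization. For intuition about what that does, here is a minimal base-R sketch on an invented matrix. It is not the affyPLM implementation: it breaks ties by order of appearance and does not handle missing values. It only shows the idea that every sample is forced onto the same distribution. The function name quantile_normalize is made up for this example.

```r
# Toy quantile normalization: every column ends up with the same set of values
# (the mean of the sorted columns), placed according to the original ranks.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")  # rank of each value within its sample
  sorted <- apply(x, 2, sort)                         # each column sorted ascending
  reference <- unname(rowMeans(sorted))               # shared target distribution
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Invented expression values: 4 probes x 3 samples
toy <- matrix(c(5, 2, 3, 4,
                4, 1, 4.5, 2,
                3, 4, 6, 8),
              nrow = 4,
              dimnames = list(paste0("probe", 1:4), paste0("sample", 1:3)))

toy_qn <- quantile_normalize(toy)

# After normalization the sorted values are the same in every column
print(apply(toy_qn, 2, function(col) sort(unname(col))))
```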

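# Extract the expression matrix to check whether the dataset appears to be normalized (for example with summary(e) or boxplot(e)).

e <- exprs(eset)

# Export the matrix to a tab-delimited file (the file name is only an example):
write.table(e,"Illumina.txt", sep="\t",col.names=NA,quote=F)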
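One common way to judge whether a matrix such as e is already normalized and log-transformed is to compare the per-sample distributions. The sketch below uses an invented matrix in place of exprs(eset), with made-up sample names. How similar the columns need to be is a judgment call, not a rule from GEO or limma.

```r
# Invented stand-in for e <- exprs(eset): 1000 probes x 4 samples on a log2-like scale
set.seed(1)
e <- matrix(rnorm(4000, mean = 8, sd = 2), ncol = 4,
            dimnames = list(NULL, paste0("GSM", 1:4)))

# Per-sample quantiles: similar columns suggest the samples are on a comparable scale
sample_quantiles <- apply(e, 2, quantile,
                          probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
print(round(sample_quantiles, 2))

# Very large maxima (far above about 20) would hint that the data are not log2-transformed
print(max(e, na.rm = TRUE))

# Side-by-side boxplots give the same picture graphically
boxplot(e, las = 2, ylab = "expression")
```

If the boxes sit at clearly different heights, a normalization step such as the commented-out quantile normalization may still be worth applying.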


And finally, a link for Agilent microarray normalization:

http://matticklab.com/index.php?title=Single_channel_analysis_of_Agilent_microarray_data_with_Limma


Also useful for beginners like me. Thanks for sharing.


Hello, thank you for sharing! But you should post this as a tutorial, not a question, and you could format your code like this to make it easier to read:

library(affy)


Thank you, friends. I still remember how hard it was for me when I was first learning normalization :)


@F: I reformatted your post to make it more readable. Take a look and confirm things look ok. Post type was also changed to tutorial to correctly reflect the content.


Thank you so much, it really looks much more readable. Today I was googling for Illumina HumanHT-12 V3.0 expression BeadChip data normalization, and I thought I would share what I learned for students who are beginners in R.


I did some further modifications: some assignment arrows were converted to <- and the quotation marks were changed too.