I am trying to obtain normalized expression values for our Affymetrix microarray data using Bioconductor. The script I have is mostly there; however, I am irked that the header row labels the samples by the file names of the CEL files rather than by the sample names designated in Targets.txt.
Here is an excerpt of my Targets.txt file:
# file for use by limma and affylmGUI. the targets are per-condition and per-time-point.
Name FileName Target
feh.rep1 FH1.CEL feh
feh.rep2 FH2.CEL feh
feh.rep3 FH3.CEL feh
...
Here is the R script:
library(affy)
library(gcrma)
# phenotype data
pd <- read.AnnotatedDataFrame("Targets.txt", header=TRUE)
affy.data <- ReadAffy(filenames=pd$FileName, phenoData=pd)
# gene expression data, normalized by GCRMA
eset <- gcrma(affy.data)
write.exprs(eset, file="cs_hm_feh-expression-gcrma-2011-01-12.tsv", sep="\t")
I have tried using
affy.data <- ReadAffy(filenames=pd$FileName, phenoData=pd, sampleNames=row.names(pd))
however, that was unsuccessful. Further investigation shows that row.names isn't actually returning anything at all:
> row.names(pd)
NULL
which I find perplexing, given that the object shows it has row names, which are exactly what I want (and expected) as my sample labels in the final TSV table:
> pd
An object of class "AnnotatedDataFrame"
rowNames: feh.rep1, feh.rep2, ..., cs.8d.rep3 (27 total)
varLabels and varMetadata description:
FileName:
Target:
Any help is appreciated, as I wield R fairly ignorantly and cannot figure my way through this one seemingly simple task.
Thanks, Brad! Excellent suggestion reading the docs. I was reading the wrong ones (for ReadAffy and read.AnnotatedDataFrame). Worked perfectly using sampleNames=sampleNames(pd). I'm still surprised I had to manually specify this, but I probably made the wrong assumptions and/or abused the method calls.
Chris, agreed that it seems like it should work without manually specifying it; that's what made me so unsure I was answering your question correctly. Glad that it worked.
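For anyone landing here later, the fix from the comments can be sketched as the script below. It is only a sketch: it assumes Targets.txt is tab-delimited with the sample names in its first column (the read.AnnotatedDataFrame defaults), and the as.character() call is a precaution in case FileName is read in as a factor. Neither detail is confirmed in the question.

```r
library(affy)
library(gcrma)

# phenotype data; assumes a tab-delimited file with sample names in column 1
pd <- read.AnnotatedDataFrame("Targets.txt", header=TRUE)

# sampleNames() is the accessor that returns the names shown as
# "rowNames" when an AnnotatedDataFrame is printed
affy.data <- ReadAffy(filenames=as.character(pd$FileName),
                      phenoData=pd,
                      sampleNames=sampleNames(pd))

# gene expression data, normalized by GCRMA
eset <- gcrma(affy.data)

# columns of the output table should now be labelled feh.rep1, feh.rep2, ...
write.exprs(eset, file="cs_hm_feh-expression-gcrma-2011-01-12.tsv", sep="\t")
```

The only change from the script in the question is the sampleNames argument, which uses sampleNames(pd) where the failed attempt used row.names(pd).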