Sometimes you may wish to use a CDF file obtained from somewhere other than the default ones provided through Bioconductor. This tutorial is not about how to make such a CDF, but merely about how to install it and use it with existing Bioconductor tools such as the affy package and its rma function.
#Create a package for downloaded CDF
#This assumes you have already downloaded or created a CDF and you just want to use it in Bioconductor with (for example) ReadAffy
#For example, suppose you wanted to use a specific CDF from the Affy web site:
#http://www.affymetrix.com/support/technical/byproduct.affx?product=hugene-1_0-st-v1
#Note, this example is purely for illustration. There is no real need to create a package for HuGene-1_0-st-v1
#This chip is already supported by Bioconductor and can be loaded with
#library(hugene10stv1cdf)
#cdfname="hugene10stv1"

#Install package for making cdf packages
#(On current Bioconductor releases, biocLite has been replaced by BiocManager::install("makecdfenv"))
source("http://bioconductor.org/biocLite.R")
biocLite("makecdfenv")
library(makecdfenv)

#Create CDF package in temporary directory
pkgpath <- tempdir()
make.cdf.package("HuGene-1_0-st-v1.r3.cdf", cdf.path="/Users/ogriffit/Downloads/HuGene-1_0-st-v1.r3.unsupported-cdf", compress=FALSE, species = "Homo_sapiens", package.path = pkgpath)
dir(pkgpath)

#Install that package at a terminal using 'pkgpath' from above
#R CMD INSTALL /var/folders/8j/bqry255x52q6_dhyw22w6sq80000gn/T//Rtmppd9YEK/hugene10stv1.r3cdf

#Then, load it for use here
library(hugene10stv1.r3cdf)

#Load the packages used below (they must already be installed)
library(GEOquery) #provides getGEOSuppFiles
library(R.utils) #provides gunzip
library(affy) #provides ReadAffy, rma and exprs

#Download a CEL file package for testing purposes
getGEOSuppFiles("GSE27447")

#Unpack the CEL files
untar("GSE27447/GSE27447_RAW.tar", exdir="data")
cels = list.files("data/", pattern = "CEL")
sapply(paste("data", cels, sep="/"), gunzip)

#List the CEL files again now that they are uncompressed (the names no longer end in .gz)
cels = list.files("data/", pattern = "CEL")

#Move into the data directory so the bare file names in 'cels' resolve
setwd("/Users/ogriffit/data")
raw.data=ReadAffy(verbose=TRUE, filenames=cels, cdfname="hugene10stv1.r3cdf") #Custom installed CDF

#You can now go on to whatever normalizing and analysis you wish with the data using your custom CDF package
#Perform RMA normalization
data.rma.norm=rma(raw.data)

#Get the important stuff out of the data - the expression estimates for each array
rma=exprs(data.rma.norm)
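The terminal step (R CMD INSTALL) can also be carried out without leaving the R session. The following is a minimal sketch, not part of makecdfenv: install_source_pkg is a hypothetical helper name introduced here for illustration, and it assumes the directory written by make.cdf.package is an ordinary source package containing a DESCRIPTION file.

```r
# Hypothetical helper (an assumption for illustration, not a makecdfenv function):
# install a source package directory from within R instead of using a terminal.
install_source_pkg <- function(pkg_dir, lib = .libPaths()[1]) {
  desc_file <- file.path(pkg_dir, "DESCRIPTION")
  # A source package directory must contain a DESCRIPTION file
  if (!file.exists(desc_file)) {
    stop("No DESCRIPTION file found in: ", pkg_dir)
  }
  # repos = NULL makes install.packages() treat pkg_dir as a local path
  install.packages(pkg_dir, repos = NULL, type = "source", lib = lib)
  # Return the package name recorded in DESCRIPTION
  unname(read.dcf(desc_file, fields = "Package")[1, 1])
}

# Usage, assuming 'pkgpath' is the directory passed to make.cdf.package():
# pkg <- install_source_pkg(file.path(pkgpath, "hugene10stv1.r3cdf"))
# library(pkg, character.only = TRUE)
```

Returning the package name lets the caller load the package with library(pkg, character.only = TRUE) without retyping it. Note that install.packages reports a failed installation as a warning rather than an error, so it is worth checking that the library call succeeds.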