11.5 years ago
Obi Griffith
20k
Sometimes you may wish to use a CDF file obtained from somewhere other than the default ones provided through Bioconductor. This tutorial is not about how to make such a CDF, only about how to install it and use it with existing Bioconductor tools such as the affy package and its rma function.
#Create a package for downloaded CDF
#This assumes you have already downloaded or created a CDF and you just want to use it in Bioconductor with (for example) ReadAffy
#For example suppose you wanted to use a specific CDF from the Affy web site:
#http://www.affymetrix.com/support/technical/byproduct.affx?product=hugene-1_0-st-v1
#Note, this example is purely for illustration. There is no real need to create a package for HuGene-1_0-st-v1
#This chip is already supported by Bioconductor and can be loaded with
#library(hugene10stv1cdf) #cdfname="hugene10stv1"
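#Install package for making cdf packages
#Note: biocLite was current when this was written; Bioconductor 3.8 and later use BiocManager instead:
#install.packages("BiocManager"); BiocManager::install("makecdfenv")
source("http://bioconductor.org/biocLite.R")
biocLite("makecdfenv")
library(makecdfenv)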
#Create CDF package in temporary directory
pkgpath <- tempdir()
make.cdf.package("HuGene-1_0-st-v1.r3.cdf", cdf.path="/Users/ogriffit/Downloads/HuGene-1_0-st-v1.r3.unsupported-cdf", compress=FALSE, species = "Homo_sapiens", package.path = pkgpath)
dir(pkgpath)
#Install that package at a terminal using 'pkgpath' from above
#R CMD INSTALL /var/folders/8j/bqry255x52q6_dhyw22w6sq80000gn/T//Rtmppd9YEK/hugene10stv1.r3cdf
#Then, load it for use here
library(hugene10stv1.r3cdf)
#Download a CEL file package for testing purposes
library(GEOquery) #provides getGEOSuppFiles (install from Bioconductor if needed)
getGEOSuppFiles("GSE27447")
#Unpack the CEL files
untar("GSE27447/GSE27447_RAW.tar", exdir="data")
cels = list.files("data/", pattern = "CEL")
sapply(paste("data", cels, sep="/"), gunzip)
cels = list.files("data/", pattern = "CEL")
setwd("/Users/ogriffit/data")
library(affy) #provides ReadAffy and rma
raw.data=ReadAffy(verbose=TRUE, filenames=cels, cdfname="hugene10stv1.r3cdf") #Custom installed CDF
#You can now go on to whatever normalizing and analysis you wish with the data using your custom CDF package
#perform RMA normalization
data.rma.norm=rma(raw.data)
#Get the important stuff out of the data - the expression estimates for each array
rma=exprs(data.rma.norm)
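The terminal step above (R CMD INSTALL on the generated directory) can also be sketched from within R. This is only a minimal sketch: `install_cdf_package` and its `dry_run` argument are hypothetical names introduced here, not part of makecdfenv, and it assumes the package directory was written directly under `pkgpath`.

```r
# Hypothetical helper (not part of makecdfenv): install a generated CDF
# package directory from within R instead of running R CMD INSTALL.
install_cdf_package <- function(pkgpath, pkgname, dry_run = FALSE) {
  pkgdir <- file.path(pkgpath, pkgname)
  if (!dir.exists(pkgdir)) {
    stop("No package directory found at: ", pkgdir)
  }
  if (!dry_run) {
    # repos = NULL with type = "source" installs from a local source directory
    install.packages(pkgdir, repos = NULL, type = "source")
  }
  invisible(pkgdir)
}

# Usage, with the names from the example above:
# install_cdf_package(pkgpath, "hugene10stv1.r3cdf")
# library(hugene10stv1.r3cdf)
```

With `dry_run = TRUE` the helper only checks that the directory exists and returns its path, which is a cheap way to confirm what `dir(pkgpath)` showed before installing.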
I think one more Bioconductor package that also needs to be installed is `S4Vectors`: http://www.bioconductor.org/packages/release/bioc/html/S4Vectors.html
I was trying to add this as a comment instead. #quickfingers
Hi,
I tried to follow the example using the file GPL8715_Hs133P_Hs_UG_8.cdf, but when I try to normalize using the rma method, I get the following error:
Error in getCdfInfo(object) :
Could not obtain CDF environment, problems encountered:
Specified environment does not contain GPL8715Hs133PHsUG8cdf
Library - package gpl8715hs133phsug8cdf not installed
Bioconductor - gpl8715hs133phsug8cdf not available
In addition: Warning message:
missing cdf environment! in show(AffyBatch)
In the previous steps I didn't get any error message, and I can even load the newly created package.
How can I fix it?
Best regards,
Juan.
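One way to narrow this kind of error down, assuming the "Library" line of the message names the package that was looked for, is to check whether a package with exactly that name is installed in a library R actually searches. The helper name `is_pkg_installed` is hypothetical and uses only base R; it is a diagnostic sketch, not a fix.

```r
# Hypothetical helper: TRUE if a package with this exact name is installed
# in any library on .libPaths().
is_pkg_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# Package name copied verbatim from the error message above
is_pkg_installed("gpl8715hs133phsug8cdf")

# Libraries R searches; a package installed elsewhere will not be found
.libPaths()
```

If this returns FALSE while a differently named package loads fine, comparing the two names character by character is a reasonable next step.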
Start by opening a new post with your question. Please do not use the ANSWER box to ask questions.