I wonder if anyone is familiar with the Bioconductor RankProduct package for ranking and obtaining differentiallly expressed genes. Some info about the software are as follows:
http://bioinformatics.oxfordjournals.org/content/22/22/2825.full http://www.bioconductor.org/packages/release/bioc/manuals/RankProd/man/RankProd.pdf http://www.bioconductor.org/packages/2.11/bioc/vignettes/RankProd/inst/doc/RankProd.pdf
I ran into some problems while using the program, maybe because of my little knowledge of R language. I tried to replicate the steps in the pdf files above with my own data. Although my own datasets were not in the afffy .cel files as in the examples, but only as rows and columns in a tab-delimited file. I have two conditions (1 and 2, replicate = 4 for each)
My codes are as follows:
library(RankProd) library(preprocessCore) #Read expression data #gdata <- read.table(file="data2.txt", sep="\t", header=T) #9000 rows of genes X 8 columns of chips gdata <- read.table(file="data2.txt", sep="\t", header=T, row.names=1) #9000 rows of genes X 8 columns of chips #colnames(gdata) # This vector contains the microarray sample names SampleNames= names(data.frame(gdata[,-1])) #names(datExpr)=gdata[,1] # This vector contains the gene names datExpr.gnames= gdata$GeneName # Since the first column contains the gene names, exclude it. # dataExp is then the matix required datExpr=data.frame(gdata[,-1]) #convert data into matrix form datExpr <- as.matrix(datExpr) #data normalization - quantile normalization #datExpr.log.norm <- normalize.quantiles((log2(datExpr)),copy=TRUE) #with logged data datExpr <- datExpr.log.norm #datExpr.norm <- normalize.quantiles(datExpr,copy=TRUE) #without logged data #datExpr <- datExpr.norm # Identify two class data - control/treatment (or condition 1/condition2) nl <- 4 n2 <- 4 cl <- rep(c(0,1), c(nl, n2)) datExpr.cl <- cl # data were generated under identical or very similar conditions except the # factor of interest (e.g., control and treatment), origin <- rep(1, nl + n2) datExpr.origin <- origin # Data anslysis datExpr.sub <- datExpr[,which(datExpr.origin == 1)] datExpr.cl.sub <- datExpr.cl[which(datExpr.origin == 1)] datExpr.origin.sub <- datExpr.origin[which(datExpr.origin == 1)] #Rank product analysis and output #RP.out <- RP(datExpr.sub, datExpr.cl.sub, num.perm = 100, logged = TRUE,na.rm = FALSE, plot = FALSE, rand = 123) RP.out <- RPadvance(datExpr.sub, datExpr.cl.sub, datExpr.origin.sub, num.perm = 100,logged = TRUE, na.rm = FALSE, gene.names = datExpr.gnames, plot = FALSE,rand = 123) # Output a table of the identified genes based on user-specified selection criteria topGene(RP.out, cutoff = 0.05, method = "pfp", logged = TRUE,logbase = 2, gene.names = datExpr.gnames)
I did run the code, but my fold changes for differentially expressed genes in one condition VS the other were either 0's or infinities. I wonder if anyone with experience with this program can help me.