Question

Why Could I Only Get Two Different Expressed Genes By Limma?

0

Entering edit mode

11.0 years ago

chenyongzi77 • 0

Hello everyone, I am new in microarray data analysis. recently I want to use Limma to find different expressed genes between grade I and grade III, which you could see from the annotation file. but when I use the R code below to do this, I only got two different expressed genes, could anybody tell me why? Is there anything wrong with my code? thank you in advance!

#the annotation file, I can't post the entire file because it is too long.
geo_accn    sample_name    treatment    dataset    grade

GSM65321    KIT_104B91    tamoxifen    KJX64    3

GSM65330    KIT_114B68    tamoxifen    KJX64    1

GSM65335    KIT_131B79    tamoxifen    KJX64    3

GSM65336    KIT_135B40    tamoxifen    KJX64    1

GSM65331    KIT_138B34    tamoxifen    KJX64    1

GSM65338    KIT_162B98    tamoxifen    KJX64    2

GSM65332    KIT_173B43    tamoxifen    KJX64    2



# the R code
rm(list = ls())

library(affy)

library(affycoretools)

library(simpleaffy)

library(gcrma, warn.conflicts = FALSE)

library(hgu133a.db)

library(a4)

datainfo <- read.table("GSE2990_suppl_new.txt", header = T, stringsAsFactors = FALSE)

data_CELnb <- datainfo[,1]

rawAffyData <- ReadAffy(filenames = paste(data_CELnb, ".CEL", sep = "")) 

pdata_lung <- datainfo

rownames(pdata_lung) <- sampleNames(rawAffyData)

metaDatalung <- data.frame(labelDescription =  colnames(pdata_lung))

phenoDataLung <- new("AnnotatedDataFrame", data = pdata_lung, varMetadata = metaDatalung)

phenoData(rawAffyData) <- phenoDataLung

eset<-rma(rawAffyData)

design <- model.matrix(~ -1 + factor(pData(eset)$grade))

colnames(design) <- c("Stage1", "Stage2","Stage3")

fit <- lmFit(eset, design)

contrast.matrix <- makeContrasts(Stage1-Stage3, levels = design)

fit2 <- contrasts.fit(fit, contrast.matrix)

fit2 <- eBayes(fit2)

fit2$genes$Symbol=getSYMBOL(fit2$genes$ID, "hgu133a")#insert symbol

fit2$genes$EG <- getEG(fit2$genes$ID, "hgu133a")# insert entrezid

fit2$genes$GeneName <- unlist(mget(fit2$genes$ID, hgu133aGENENAME))# insert the gene name

resultsLimmaCommon <- limma:::topTable(fit2, coef = 1, adjust.method = "fdr", number = nrow(eset),sort.by="P")

resultsLimmaCommon_Sig <- resultsLimmaCommon[abs(resultsLimmaCommon$logFC)> 2 & resultsLimmaCommon$adj.P.Val < 0.05, ]

summary(resultsLimmaCommon_Sig)

limma r microarray • 4.6k views

ADD COMMENT • link updated 11.0 years ago by Houkto ▴ 220 • written 11.0 years ago by chenyongzi77 • 0

0

Entering edit mode

I would first check the P value distribution in the topTable... to see how the topTable looks like... to check the distribution

ADD REPLY • link 11.0 years ago by k.nirmalraman ★ 1.1k

score 0 · Answer 1 · 2013-05-31

0

Entering edit mode

11.0 years ago

Houkto ▴ 220

Hi chenyongzi77,

First, is the data avaliable on public domain so I can have a go and then tell you if the results you got is true or not ? Second, I see you done a normalization with gcrma. can you do a regular rma and see what are the results. Third, results are the results maybe this is the only changes between what ever you measuring. I have seen experiment with no significant differentially expressed genes because the changes are not on the transcriptomics level maybe proteomic or metabolomic. lastly, do a plot of pre and post normalization of the CEL files to know if each CEL files corrospond to what ever sample you have and they are grouped correctly when you run LIMMA. Another thing check if there is another factor (technical) that might lead to a misleading results or masked a true biological differences such as scanning date, batch effect, reagents with principle component analysis (PCA).

ADD COMMENT • link 11.0 years ago by Houkto ▴ 220

0

Entering edit mode

The "geo_accn" column tells you that these data are public, in the GEO database. So you or anyone else can go there, search for these samples, identify the GEO series and run GEO2R at the website to confirm the findings.

ADD REPLY • link 11.0 years ago by Neilfws 49k

0

Entering edit mode

It is so nice of you to answer me this. thank you . and the data is available on line.http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2990. and the paper said they could find a lot of different expressed genes.

ADD REPLY • link 11.0 years ago by chenyongzi77 • 0