Question: Why Could I Only Get Two Different Expressed Genes By Limma?
0
gravatar for chenyongzi77
5.9 years ago by
chenyongzi770 wrote:

Hello everyone, I am new in microarray data analysis. recently I want to use Limma to find different expressed genes between grade I and grade III, which you could see from the annotation file. but when I use the R code below to do this, I only got two different expressed genes, could anybody tell me why? Is there anything wrong with my code? thank you in advance!

#the annotation file, I can't post the entire file because it is too long.
geo_accn    sample_name    treatment    dataset    grade

GSM65321    KIT_104B91    tamoxifen    KJX64    3

GSM65330    KIT_114B68    tamoxifen    KJX64    1

GSM65335    KIT_131B79    tamoxifen    KJX64    3

GSM65336    KIT_135B40    tamoxifen    KJX64    1

GSM65331    KIT_138B34    tamoxifen    KJX64    1

GSM65338    KIT_162B98    tamoxifen    KJX64    2

GSM65332    KIT_173B43    tamoxifen    KJX64    2



# the R code
rm(list = ls())

library(affy)

library(affycoretools)

library(simpleaffy)

library(gcrma, warn.conflicts = FALSE)

library(hgu133a.db)

library(a4)

datainfo <- read.table("GSE2990_suppl_new.txt", header = T, stringsAsFactors = FALSE)

data_CELnb <- datainfo[,1]

rawAffyData <- ReadAffy(filenames = paste(data_CELnb, ".CEL", sep = "")) 

pdata_lung <- datainfo

rownames(pdata_lung) <- sampleNames(rawAffyData)

metaDatalung <- data.frame(labelDescription =  colnames(pdata_lung))

phenoDataLung <- new("AnnotatedDataFrame", data = pdata_lung, varMetadata = metaDatalung)

phenoData(rawAffyData) <- phenoDataLung

eset<-rma(rawAffyData)

design <- model.matrix(~ -1 + factor(pData(eset)$grade))

colnames(design) <- c("Stage1", "Stage2","Stage3")

fit <- lmFit(eset, design)

contrast.matrix <- makeContrasts(Stage1-Stage3, levels = design)

fit2 <- contrasts.fit(fit, contrast.matrix)

fit2 <- eBayes(fit2)

fit2$genes$Symbol=getSYMBOL(fit2$genes$ID, "hgu133a")#insert symbol

fit2$genes$EG <- getEG(fit2$genes$ID, "hgu133a")# insert entrezid

fit2$genes$GeneName <- unlist(mget(fit2$genes$ID, hgu133aGENENAME))# insert the gene name

resultsLimmaCommon <- limma:::topTable(fit2, coef = 1, adjust.method = "fdr", number = nrow(eset),sort.by="P")

resultsLimmaCommon_Sig <- resultsLimmaCommon[abs(resultsLimmaCommon$logFC)> 2 & resultsLimmaCommon$adj.P.Val < 0.05, ]

summary(resultsLimmaCommon_Sig)
R limma microarray • 2.9k views
ADD COMMENTlink modified 5.9 years ago by Houkto210 • written 5.9 years ago by chenyongzi770

I would first check the P value distribution in the topTable... to see how the topTable looks like... to check the distribution

ADD REPLYlink written 5.9 years ago by k.nirmalraman980
0
gravatar for Houkto
5.9 years ago by
Houkto210
Houkto210 wrote:

Hi chenyongzi77,

First, is the data avaliable on public domain so I can have a go and then tell you if the results you got is true or not ? Second, I see you done a normalization with gcrma. can you do a regular rma and see what are the results. Third, results are the results maybe this is the only changes between what ever you measuring. I have seen experiment with no significant differentially expressed genes because the changes are not on the transcriptomics level maybe proteomic or metabolomic. lastly, do a plot of pre and post normalization of the CEL files to know if each CEL files corrospond to what ever sample you have and they are grouped correctly when you run LIMMA. Another thing check if there is another factor (technical) that might lead to a misleading results or masked a true biological differences such as scanning date, batch effect, reagents with principle component analysis (PCA).

ADD COMMENTlink written 5.9 years ago by Houkto210

The "geo_accn" column tells you that these data are public, in the GEO database. So you or anyone else can go there, search for these samples, identify the GEO series and run GEO2R at the website to confirm the findings.

ADD REPLYlink written 5.9 years ago by Neilfws48k

It is so nice of you to answer me this. thank you . and the data is available on line.http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2990. and the paper said they could find a lot of different expressed genes.

ADD REPLYlink written 5.9 years ago by chenyongzi770
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2054 users visited in the last hour