Get the most expressed genes from one .CEL file in R.
1
2
Entering edit mode
9.6 years ago
alevax ▴ 20

Hi everybody!

LIMMA in R can give you a list of differentially expressed genes.

How can I simply get the probesets with highest signal intensity?

Can I get only the most expressed genes in an healty experiment, for example from one .CEL file?

Or the most expressed genes from a set of .CEL files of the same group (all of the control group, or all of the sample group).

Thanks! :)

Limma R Differential-Expression • 3.9k views
ADD COMMENT
0
Entering edit mode
source("http://www.bioconductor.org/biocLite.R")
biocLite(c("GEOquery","affy","limma","gcrma"))
gse_number <- "GSE13887"
getGEOSuppFiles( gse_number )
COMPRESSED_CELS_DIRECTORY <- gse_number
untar( paste( gse_number , paste( gse_number , "RAW.tar" , sep="_") , sep="/" ), exdir=COMPRESSED_CELS_DIRECTORY)
cels <- list.files( COMPRESSED_CELS_DIRECTORY , pattern = "[gz]")
sapply( paste( COMPRESSED_CELS_DIRECTORY , cels, sep="/") , gunzip )
celData <- ReadAffy( celfile.path = gse_number )

gcrma.ExpressionSet <- gcrma(celData)

if(!require(panp)) { biocLite("panp") }
library(panp)
myGDS <- getGEO("GDS2697")
eset <- GDS2eSet(myGDS,do.log2=TRUE)
my_pa <- pa.calls(eset)

is(eset)
is(gcrma.ExpressionSet)

If you run the script you'll get an error while executing:

my_pa <- pa.calls(eset)

and not while executing

my_pa <- pa.calls(gcrma.ExpressionSet)

Why if they are both ExpressionSet?

> is(gcrma.ExpressionSet)
[1] "ExpressionSet"    "eSet"             "VersionedBiobase" "Versioned"
> is(eset)
[1] "ExpressionSet"    "eSet"             "VersionedBiobase" "Versioned"
ADD REPLY
1
Entering edit mode

The GDS does not contain enough information to make PA calls. The PA calls need match and mismatch probe data and the GDS contains only normalized probeset data, likely derived from match probes only.

ADD REPLY
0
Entering edit mode
9.6 years ago

The order() function in R is perhaps a useful place to start. Note, though, that microarrays measure relative expression between samples and not absolute expression, so simply ordering by intensity is not really a quantitative way to get the most expressed genes (though qualitatively it might do).

ADD COMMENT
0
Entering edit mode

Ok, I understand.

So, how can Iget the absolute expression of one microarray sample?

ADD REPLY
1
Entering edit mode

Unfortunately, microarrays (and RNA-seq, for that matter) do not measure absolute expression, so you cannot.

ADD REPLY
0
Entering edit mode

Well, but if I take 2 .CEL files of the same group (e.g. 2 of the control group) and I perform the GCRMA obtaining the ExpessionSet object after the GCRMA, the gcrma.ExpressionSet, the pa.calls() works.

Thus, Should not I trust its result?

ADD REPLY
1
Entering edit mode

See these papers for some background on expressed/unexpressed gene calling:

ADD REPLY
0
Entering edit mode

Thank you. They were interesting readings.

ADD REPLY

Login before adding your answer.

Traffic: 2646 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6