Question: How to analyse two-color microarray data from GEO with limma?
MatthewP (China) wrote 7 weeks ago:

Hello, everyone. I want to download and analyse dataset GSE149940 with limma, but I still have a few questions after reading some of the documentation.
I can get the expression matrix with _GEOquery_:

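library(GEOquery)
# Parse a locally downloaded series matrix file into an ExpressionSet
gse <- getGEO(filename = matrixPath, destdir = sourceDir, getGPL = FALSE, AnnotGPL = FALSE)
expr1 <- exprs(gse)  # matrix of the processed values deposited in GEO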

I've read the _limma_ User's Guide, which describes how to read two-color array data and how to construct the design matrix for this kind of dye-swap design.
Can I pass the expression matrix expr1 extracted with _GEOquery_ directly to limma for analysis? Or do I need to download the raw data from GEO and read it into _limma_ with the read.maimages function? The second way seems quite complicated to me, as I have never worked with raw microarray data before.

By the way, the data processing described in GEO is:

Agilent Feature Extraction Software (v 8.5.1.1) was used for background subtraction and LOWESS normalization. Normalized log10 ratio (Cy3/Cy5) representing test/reference for samples 2301 R – 2354 S, 2351 R – 2309 S, 2317 R – 2314 S, 2343 R – 2284 S, 2343 R – 2284 S, 2358 R – 2355 S, and 2367 R – 2369 S; normalized log10 ratio (Cy5/Cy3) representing test/reference for samples 2354 S – 2301 R, 2309 S – 2351 R, 2314 S – 2317 R, 2284 S – 2343 R, 2355 S – 2358 R, and 2369 S – 2367 R

So this seems to be how expr1 was produced.

Gordon Smyth (Australia) wrote 6 weeks ago:

Yes, you can do either. You can analyse the matrix of normalized log-ratios that you get from GEOquery in limma; limma will accept the GEOquery object directly. Or you can read the raw data files into limma using read.maimages. Either way, the most important thing will be to set up the two-color design matrix appropriately.

The only limitation of using the GEOquery matrix is that you won't have access to A-values, and hence you can't make MA plots or use the trend=TRUE option of eBayes.

Reading and normalizing the raw files is straightforward and gives full access to all limma capability:

> files
 [1] "GSM4518466_2301R_vs_2354S.txt.gz" "GSM4518467_2354S_vs_2301R.txt.gz"
 [3] "GSM4518468_2351R_vs_2309S.txt.gz" "GSM4518469_2309S_vs_2351R.txt.gz"
 [5] "GSM4518470_2317R_vs_2314S.txt.gz" "GSM4518471_2314S_vs_2317R.txt.gz"
 [7] "GSM4518472_2343R_vs_2284S.txt.gz" "GSM4518473_2284S_vs_2343R.txt.gz"
 [9] "GSM4518474_2358R_vs_2355S.txt.gz" "GSM4518475_2355S_vs_2358R.txt.gz"
[11] "GSM4518476_2367R_vs_2369S.txt.gz" "GSM4518477_2369S_vs_2367R.txt.gz"
> library(limma)
> RG <- read.maimages(files, source="agilent")
> RGb <- backgroundCorrect(RG, method="normexp")
> MA <- normalizeWithinArrays(RGb, method="loess")

Among other things, this pipeline reads and collates the gene annotation from the raw files.
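For the two-color design matrix mentioned above, a minimal sketch along the lines of the dye-swap and technical-replication examples in the limma User's Guide might look like the following. It is not taken from this thread: it assumes, from the file names only, that the 12 arrays form six consecutive dye-swap pairs, and the sign convention (which orientation gets +1) is an assumption that must be checked against the Cy3/Cy5 labelling of each array. The names `logratios`, `design`, `pair`, `corfit` and `fit` are illustrative, and `logratios` is simulated noise standing in for the real log-ratios.

```r
library(limma)

# Simulated stand-in for the real data (200 probes x 12 arrays of log-ratios).
set.seed(1)
logratios <- matrix(rnorm(200 * 12), nrow = 200, ncol = 12)

# Assumption: arrays alternate orientation within each dye-swap pair,
# as the file names (e.g. 2301R_vs_2354S then 2354S_vs_2301R) suggest.
# Which orientation should be +1 depends on how the log-ratios are defined.
design <- rep(c(1, -1), times = 6)

# Assumption: the two arrays of a pair hybridize the same two samples,
# so they are treated as technical replicates of one biological pair.
pair <- rep(1:6, each = 2)

# Estimate the within-pair correlation and use it in the linear model fit.
corfit <- duplicateCorrelation(logratios, design, ndups = 1, block = pair)
fit <- lmFit(logratios, design, block = pair,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
topTable(fit)
```

With real data, `MA` from the pipeline above or the GEOquery matrix would take the place of `logratios`. The orientation of the GEO values may differ from that of `MA`, so the signs in `design` would need to be checked separately for each input.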


MatthewP replied 6 weeks ago:

Thanks! The processed data seem to be log10 ratios. Do I need to convert them to log2 ratios for limma?

Gordon Smyth replied 6 weeks ago:

You can convert to log2 if you want, but it is optional. It makes no difference to the DE results (t-statistics, p-values, FDR etc.). The only difference is that the logFC and AveExpr values will be on the log10 scale if you input log10 values.
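If the log2 scale is preferred, the conversion is a single rescaling, since log2(x) = log10(x) / log10(2). A small sketch in base R, where `expr_log10` is a made-up stand-in for the GEO matrix and not real data:

```r
# Hypothetical log10 ratios standing in for the GEO matrix (2 probes x 2 arrays).
expr_log10 <- matrix(log10(c(2, 0.5, 10, 1)), nrow = 2)

# Change of base: dividing by log10(2) turns log10 ratios into log2 ratios.
expr_log2 <- expr_log10 / log10(2)
```

Because every value is multiplied by the same constant, the t-statistics and p-values are unchanged; only the logFC and AveExpr columns are rescaled.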