How to analyse two-color microarray data from GEO with limma?
3.3 years ago
MatthewP ★ 1.4k

Hello, everyone. I want to download and analyse dataset GSE149940 with limma, but I still have a few small questions even after reading some of the documentation.
I can get the expression matrix with _GEOquery_:

gse <- getGEO(filename = matrixPath, destdir = sourceDir, getGPL = FALSE, AnnotGPL = FALSE)
expr1 <- exprs(gse)

I've read the User's Guide of the _limma_ package, which describes how to read two-color chip data and how to construct the design matrix for this kind of "dye-swap" design.
I want to know: can I pass the expression matrix expr1, extracted with _GEOquery_, to limma and analyse it directly? Or do I need to download the raw data from GEO and read it into _limma_ with the read.maimages function? The second way seems quite complicated to me, because I've never worked with raw microarray data before.

By the way, the data processing described in GEO is:

Agilent Feature Extraction Software (v 8.5.1.1) was used for background subtraction and LOWESS normalization. Normalized log10 ratio (Cy3/Cy5) representing test/reference for samples 2301 R – 2354 S, 2351 R – 2309 S, 2317 R – 2314 S, 2343 R – 2284 S, 2343 R – 2284 S, 2358 R – 2355 S, and 2367 R – 2369 S; normalized log10 ratio (Cy5/Cy3) representing test/reference for samples 2354 S – 2301 R, 2309 S – 2351 R, 2314 S – 2317 R, 2284 S – 2343 R, 2355 S – 2358 R, and 2369 S – 2367 R

So this seems to be how expr1 was produced.

limma • 1.2k views
3.3 years ago
Gordon Smyth ★ 7.0k

Yes, you can do either. You can analyse the matrix of normalized log-ratios that you get from GEOquery in limma; limma will accept the GEOquery object directly. Or you can read the raw data files into limma using read.maimages. Either way, the most important thing will be to set up the two-color design matrix appropriately.
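
As a rough illustration of the design step, here is a minimal sketch of one possible dye-swap coding, following the pattern used in the limma User's Guide for dye swaps. The matrix is simulated, the names `logratios`, `DyeEffect` and `RvsS` are illustrative, and the +1/-1 coding assumes that the log-ratio orientation flips between the two arrays of each swapped pair. Whether that holds for the GEO matrix has to be checked against the sample annotation.

```r
# Simulated stand-in for a log-ratio matrix: 100 probes x 12 arrays
# (illustrative values only; this is not the GSE149940 data).
set.seed(1)
logratios <- matrix(rnorm(100 * 12, sd = 0.3), nrow = 100,
                    dimnames = list(paste0("probe", 1:100),
                                    paste0("array", 1:12)))

# Assumed coding: odd-numbered arrays are oriented R/S (+1) and
# even-numbered arrays are the dye-swapped S/R (-1).
# The constant column estimates a probe-specific dye effect.
design <- cbind(DyeEffect = 1, RvsS = rep(c(1, -1), times = 6))

# The limma calls run only if the package is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(logratios, design)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "RvsS", number = 5))
}
```

This sketch treats the 12 arrays as independent. The two arrays of each swapped pair share the same samples, so the technical-replication material in the User's Guide is worth consulting before relying on the p-values.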

The only limitation of using the GEOquery matrix is that you won't have access to A-values (average log-intensities), and hence you can't make MA plots or use the trend=TRUE option of eBayes.

Reading and normalizing the raw files is straightforward and gives full access to all limma capability:

> files
 [1] "GSM4518466_2301R_vs_2354S.txt.gz" "GSM4518467_2354S_vs_2301R.txt.gz"
 [3] "GSM4518468_2351R_vs_2309S.txt.gz" "GSM4518469_2309S_vs_2351R.txt.gz"
 [5] "GSM4518470_2317R_vs_2314S.txt.gz" "GSM4518471_2314S_vs_2317R.txt.gz"
 [7] "GSM4518472_2343R_vs_2284S.txt.gz" "GSM4518473_2284S_vs_2343R.txt.gz"
 [9] "GSM4518474_2358R_vs_2355S.txt.gz" "GSM4518475_2355S_vs_2358R.txt.gz"
[11] "GSM4518476_2367R_vs_2369S.txt.gz" "GSM4518477_2369S_vs_2367R.txt.gz"
> library(limma)
> RG <- read.maimages(files, source="agilent")
> RGb <- backgroundCorrect(RG, method="normexp")
> MA <- normalizeWithinArrays(RGb, method="loess")

This pipeline inter alia reads and collates the gene annotation from the raw files.
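
To illustrate what the A-values add downstream, here is a hedged sketch in which a simulated `MAList` stands in for the `MA` object above. The data, the object name `MA_sim`, and the +1/-1 dye-swap coding are assumptions for illustration only.

```r
# Simulated M-values (log-ratios) and A-values (average log-intensities)
# for 200 probes on 12 arrays; made-up numbers, not real data.
set.seed(2)
n_probes <- 200
n_arrays <- 12
M <- matrix(rnorm(n_probes * n_arrays, sd = 0.3), nrow = n_probes)
A <- matrix(runif(n_probes * n_arrays, min = 6, max = 14), nrow = n_probes)

# Assumed dye-swap coding: alternating orientation, plus a dye-effect column.
design <- cbind(DyeEffect = 1, RvsS = rep(c(1, -1), times = n_arrays / 2))

# The limma calls run only if the package is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  library(limma)
  MA_sim <- new("MAList", list(M = M, A = A))
  fit <- lmFit(MA_sim, design)       # row means of A are stored as Amean
  fit <- eBayes(fit, trend = TRUE)   # prior variance may depend on Amean
  print(topTable(fit, coef = "RvsS", number = 5))
}
```

With a plain matrix of log-ratios there is no `A` component, so the intensity-dependent trend is not available.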


Thanks! The processed data seem to be log10 ratios. Do I need to convert them to log2 ratios for limma?


You can convert to log2 if you want; it is optional. It makes no difference to the DE results (t-statistics, p-values, FDR etc). The only difference is that the logFC and AveExpr values will be on the log10 scale if you input log10.
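
The conversion itself is a constant rescaling. The sketch below uses made-up numbers and an ordinary one-sample t-test from base R as a stand-in (it is not limma's moderated t-statistic) to show why test statistics are unchanged while averages are rescaled.

```r
# Made-up log10 ratios for one probe across six arrays (illustration only).
x_log10 <- c(0.12, 0.30, 0.21, 0.05, 0.18, 0.27)

# Change of base: log2(x) = log10(x) / log10(2)
x_log2 <- x_log10 / log10(2)

# Ordinary one-sample t-test on both scales.
t_log10 <- t.test(x_log10)
t_log2  <- t.test(x_log2)

# The t-statistic and p-value agree; the mean is rescaled by 1 / log10(2).
print(c(t_log10 = unname(t_log10$statistic),
        t_log2  = unname(t_log2$statistic)))
print(c(mean_log10 = mean(x_log10), mean_log2 = mean(x_log2)))
```

For the whole matrix from the question, the same rescaling would be `expr1 / log10(2)`.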
