Microarray meta analysis with SOFT datasets
8.5 years ago
chenyangls ▴ 30

For microarray meta-analysis, people usually download the CEL files from databases like GEO, normalize the raw data with R packages such as affy, and then run the meta-analysis on the normalized datasets with R packages like MetaDE. However, many datasets in GEO are available in SOFT format, and these data are already normalized. Can I import SOFT datasets directly into MetaDE and do the meta-analysis?

R • 4.1k views
8.5 years ago
Martombo ★ 3.0k

You can use the GEOquery package. The function getGEO(filename="data.soft.gz") reads your data into R; for a GDS file, GDS2eSet() then converts the result into an ExpressionSet object. Have a look at http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/geo/
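
As a minimal sketch of this approach, the snippet below reads a GDS SOFT file and converts it to an ExpressionSet. It assumes GEOquery and Biobase are installed, and it uses GDS507.soft.gz, which I believe ships in GEOquery's extdata directory. The helper name read_soft_as_eset is made up for illustration.

```r
# Sketch: load a local GDS SOFT file and convert it to an ExpressionSet.
# Assumes the Bioconductor packages GEOquery and Biobase are installed.
# read_soft_as_eset is a hypothetical helper name, not part of GEOquery.
read_soft_as_eset <- function(soft_path, log2_transform = FALSE) {
  if (!nzchar(soft_path) || !file.exists(soft_path)) {
    stop("SOFT file not found: '", soft_path, "'")
  }
  # For a GDS SOFT file, getGEO() returns a GDS object.
  gds <- GEOquery::getGEO(filename = soft_path)
  # GDS2eSet() may download the platform (GPL) annotation, so it can need
  # network access even though the SOFT file itself is local.
  GEOquery::GDS2eSet(gds, do.log2 = log2_transform)
}

# Example usage, skipped when GEOquery or the example file is unavailable.
if (requireNamespace("GEOquery", quietly = TRUE)) {
  path <- system.file("extdata/GDS507.soft.gz", package = "GEOquery")
  if (nzchar(path)) {
    eset <- read_soft_as_eset(path)
    print(dim(Biobase::exprs(eset)))  # probes x samples
  }
}
```

The resulting expression matrix (Biobase::exprs(eset)) is the kind of object most meta-analysis packages expect as input, but check the input format of your package of choice.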


Thank you for the reply! So is it valid to use the processed SOFT data directly, or do I need to go back to the CEL data and normalize the different datasets in one particular way?


There is no "right" answer to this question since there is no standard for normalization and not all datasets even supply raw data.
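
If you do use the processed values, one sanity check before combining datasets is whether each one is on a log scale. The sketch below is only a rough heuristic in base R; the function name looks_log_scaled and the cutoff of 100 are assumptions chosen for illustration, not a standard.

```r
# Heuristic sketch: guess whether an expression matrix is already log-scaled.
# The cutoff of 100 on the 99th percentile is an illustrative assumption.
looks_log_scaled <- function(expr_matrix, upper_cutoff = 100) {
  q99 <- stats::quantile(expr_matrix, probs = 0.99, na.rm = TRUE)
  unname(q99 < upper_cutoff)
}

# Toy data: intensities on the raw scale, and the same values after log2.
raw_like <- matrix(c(50, 1200, 30000, 800, 15, 9000), nrow = 3)
log_like <- log2(raw_like)

looks_log_scaled(raw_like)  # FALSE
looks_log_scaled(log_like)  # TRUE
```

A check like this only catches gross scale mismatches; it says nothing about which normalization method the submitters used.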

Just a comment: getGEO() will take an accession, so there is no need to download SOFT files separately.
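
A small sketch of fetching by accession, assuming GEOquery is installed and GEO is reachable. getGEO() returns a GDS object for a GDS accession and a list of ExpressionSets for a GSE accession, so the hypothetical wrapper fetch_eset below returns an ExpressionSet in both cases. The directory name geo_cache is also just an illustrative choice.

```r
# Sketch: fetch a GEO dataset by accession and return an ExpressionSet.
# Assumes GEOquery is installed and network access works.
# fetch_eset and "geo_cache" are illustrative names, not part of GEOquery.
fetch_eset <- function(accession, cache_dir = "geo_cache") {
  if (!grepl("^(GDS|GSE)[0-9]+$", accession)) {
    stop("Expected a GDS or GSE accession, got: ", accession)
  }
  # Keep downloads in a persistent directory instead of the session tempdir.
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  geo <- GEOquery::getGEO(accession, destdir = cache_dir)
  if (methods::is(geo, "GDS")) {
    return(GEOquery::GDS2eSet(geo))
  }
  if (is.list(geo) && length(geo) > 0) {
    # A GSE yields one ExpressionSet per platform; take the first here.
    return(geo[[1]])
  }
  stop("Unexpected result type for ", accession)
}

# Example (needs network access):
# eset <- fetch_eset("GDS507")
```

For a series measured on several platforms, taking only the first list element drops data, so inspect the full list in that case.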


Thanks! I'm just curious: since there is no standard way to normalize, why do people choose to normalize in their own way rather than trusting the original data curators? It adds work and could even be less reliable.


Starting with raw data allows more thorough quality control and also ensures that the normalization was done appropriately (rather than taking the word of the original submitters). Not everyone renormalizes, and not all datasets on GEO include enough raw data to perform an appropriate normalization.
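
For the raw-data route, a rough sketch might look like the following. It assumes GEOquery and affy are installed, that the series is an Affymetrix dataset, and that the submitters uploaded a *_RAW.tar archive of CEL files, which is not always the case. The helper name normalize_from_raw is hypothetical, and RMA is used only as one common choice of normalization.

```r
# Sketch: download raw CEL files for a GEO series and normalize with RMA.
# Assumes GEOquery and affy are installed and the series has a *_RAW.tar
# supplementary archive. normalize_from_raw is a hypothetical helper name.
normalize_from_raw <- function(gse_accession, work_dir = ".") {
  # Supplementary files are saved under <work_dir>/<gse_accession>/
  GEOquery::getGEOSuppFiles(gse_accession, baseDir = work_dir)
  series_dir <- file.path(work_dir, gse_accession)

  tar_file <- list.files(series_dir, pattern = "_RAW\\.tar$", full.names = TRUE)
  if (length(tar_file) == 0) {
    stop("No raw archive found for ", gse_accession)
  }

  cel_dir <- file.path(series_dir, "cel")
  utils::untar(tar_file[1], exdir = cel_dir)

  # GEO archives usually hold .CEL.gz files. ReadAffy() can generally read
  # these directly; if it cannot, decompress them first.
  raw <- affy::ReadAffy(celfile.path = cel_dir)
  affy::rma(raw)  # background-corrects, normalizes, summarizes to log2 scale
}

# Example (needs network access; replace with a real series accession):
# eset <- normalize_from_raw("GSE...")
```

Applying the same function to every series in the meta-analysis gives all datasets the same preprocessing, which is the main argument for starting from raw data.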


Thanks for the comments. I just tried importing GDS402 with GEOquery but got the error "cannot open connection", even though I can download the file from the website. I think I've ruled out internet connection problems, since I can import other datasets, such as GDS507, with GEOquery.

I used the command:

gds <- getGEO(filename=system.file("extdata/GDS402.soft.gz",package="GEOquery"))

because when using getGEO(), I got this error message:

"cannot open destfile 'C:\Users\...\AppData\Local\Temp\RtmpiO3KxZ/GDS402.soft.gz'"

I guess it has something to do with Windows.

So did I do something wrong when using GEOquery, or has the URL of the GEO FTP site changed?
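
One thing worth checking for the "cannot open connection" error: system.file() only looks inside the installed package's own directory and returns an empty string when the file is not there. As far as I know, GEOquery bundles GDS507.soft.gz as an example file but not GDS402, which would explain why one call works and the other does not. The base-R sketch below demonstrates the behaviour with the stats package (always installed); bundled_file is a hypothetical helper name.

```r
# system.file() returns "" when the requested file is not part of the package.
# "stats" is used only because it is always installed; the file name below
# deliberately does not exist in it.
missing_path <- system.file("extdata/GDS402.soft.gz", package = "stats")
nzchar(missing_path)  # FALSE: nothing was found

# Guard to run before handing a path to a reader such as getGEO(filename = ).
# bundled_file is a hypothetical helper name.
bundled_file <- function(relative_path, package) {
  path <- system.file(relative_path, package = package)
  if (!nzchar(path)) {
    stop("'", relative_path, "' is not bundled with package '", package, "'")
  }
  path
}

# A file that does exist in every installed package:
bundled_file("DESCRIPTION", "stats")
```

For a file you downloaded yourself, pass its real location to getGEO(filename = ...) rather than wrapping it in system.file().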


All you need is:

gds = getGEO('GDS402')

I just tried gds = getGEO('GDS402') and still get the error:

cannot open destfile 'C:\Users\...\AppData\Local\Temp\RtmpiO3KxZ/GDS402.soft.gz', reason 'No such file or directory'

Please start a new R session, load the GEOquery library, run the line of code above, and paste in any error message along with the output of sessionInfo().


It worked after starting a new R session. Thanks a lot!
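
The thread does not confirm the cause, but the "cannot open destfile" message points at the session's temporary directory, and a likely explanation is that this directory had been removed while the old session was still running; a new session gets a fresh one. A minimal base-R sketch for checking this, with the hypothetical helper safe_download_dir and the illustrative directory name geo_downloads:

```r
# Sketch: check that the session temp directory still exists, and fall back
# to a persistent download directory if it does not.
# safe_download_dir and "geo_downloads" are illustrative names.
safe_download_dir <- function(fallback = "geo_downloads") {
  if (dir.exists(tempdir())) {
    return(tempdir())
  }
  dir.create(fallback, showWarnings = FALSE, recursive = TRUE)
  normalizePath(fallback)
}

dest <- safe_download_dir()

# Then pass it to getGEO() through its destdir argument (needs GEOquery):
# gds <- GEOquery::getGEO("GDS402", destdir = dest)
```

Pointing destdir at a persistent directory also means the file is not downloaded again in later sessions.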
