Question: Making An Expression Matrix
moranr (Ireland) wrote, 7.8 years ago:

Hi,

I have five data series (GSE): four from the GEO database, two of which have the raw CEL files. I want to get all of this information into a single data matrix for analysis, using the raw data where possible. I am very new to this whole area and it is proving difficult. Can anyone offer any help on this?

I can get a series into an expression set from the series matrix files via gse <- getGEO("GSExxxx", GSEMatrix=TRUE) from the GEOquery package. I think this is correct? I think I can also read all the CEL files, normalise them, and get them into an expression set using the ReadAffy function and gcrma?

Any help much appreciated. Sorry if this question doesn't even make sense; as I said, I'm very new to it all!

Thanks, Ray

Tags: R, bioconductor, microarray
Question written 7.8 years ago by moranr, modified 6.9 years ago by Biostar

Are your datasets all from the same platform? That will affect if/how you combine them.

Comment written 7.8 years ago by Obi Griffith

Yes, the same platform is used for all of them, as I think that will give a more powerful result.

Comment written 7.8 years ago by moranr

If you have 5 GSE series and 4 are from GEO, where does the fifth come from? So far as I know, all GSE come from GEO.

Comment written 7.8 years ago by Neilfws

Oh sorry, the fifth is not a GSE, it is just similar. It comes from CA express, as downloadable files with supplementary files.

Comment written 7.8 years ago by moranr

Answer by Sean Davis (National Institutes of Health, Bethesda, MD), 7.8 years ago:

It sounds like you understand the details of getting data from GEO and taking .CEL files to an ExpressionSet. Where things are going to get complicated is in getting "all of this information into a single data matrix for analysis". Doing so may not be the best approach, but it is impossible to know without a good deal more background, a level of background that is not easily communicated in a forum or email. Since you say you are relatively new to the whole area, I suggest you find a local bioinformatics collaborator who can work through the data with you.

Answer written 7.8 years ago by Sean Davis

Hi,

I'm stuck on getting all the information into a single data matrix. Could you please help me with how to do that?

Thanks.

Comment written 5.2 years ago by Parisa

Perhaps you could ask a new question and give the details of where you are getting stuck.

Comment written 5.2 years ago by Sean Davis

Answer by Obi Griffith (Washington University, St Louis, USA), 7.8 years ago:

If it were me, I would try hard to get CEL files for all of them. If they are not in GEO, you might try requesting them directly from the author. I've had about 50% success with this in the past. Then, with all CEL files, you can use affy, gcrma, and a custom CDF to create one consistently summarized and normalized dataset mapped to gene symbols (see "Retrieving Probe To Gene Ids For Affymetrix Chips In Bioconductor"). The latter would allow you to compare with any datasets at gene level where you don't have raw CEL files. Whether you are able to process all together or you try to combine differently processed data after the fact, I would be VERY aware of the potential for batch effects.
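
A minimal sketch in R of the workflow described above. It is not the answerer's own code. It assumes the Bioconductor packages affy, gcrma and Biobase are installed, and it leaves out the custom CDF step. The names `normalise_cel_dir`, `combine_matrices` and `cel_dir` are hypothetical, and the toy matrices stand in for real expression data.

```r
# Read every CEL file in a directory and normalise them together.
# Assumption: `cel_dir` is a hypothetical folder holding the CEL files
# from all series that share one Affymetrix platform.
normalise_cel_dir <- function(cel_dir) {
  raw  <- affy::ReadAffy(celfile.path = cel_dir)  # AffyBatch of raw intensities
  eset <- gcrma::gcrma(raw)                       # background-correct, normalise, summarise
  Biobase::exprs(eset)                            # matrix: probe sets x samples
}

# Combine expression matrices on the row identifiers they share
# (probe set IDs or gene symbols) and record which series each
# column came from, so batch can be checked or modelled later.
combine_matrices <- function(mats) {
  stopifnot(length(mats) > 0, !is.null(names(mats)))
  shared <- Reduce(intersect, lapply(mats, rownames))
  subset <- lapply(mats, function(m) m[shared, , drop = FALSE])
  combined <- do.call(cbind, unname(subset))
  batch <- rep(names(mats), times = vapply(mats, ncol, integer(1)))
  list(expr = combined, batch = factor(batch, levels = names(mats)))
}

# Toy example with two small "series" that overlap on g2 and g3.
a <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), c("a1", "a2")))
b <- matrix(c(7, 8, 9, 10), nrow = 2,
            dimnames = list(c("g2", "g3"), c("b1", "b2")))
res <- combine_matrices(list(seriesA = a, seriesB = b))
```

Here `res$expr` keeps only the rows present in every input, and `res$batch` labels each column with its series of origin. A quick clustering or PCA coloured by `res$batch` is one common way to see whether samples group by series rather than by biology.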

Answer written 7.8 years ago by Obi Griffith