Question: Microarray Meta-Analysis
5
7.6 years ago by
Nasir50
Nasir50 wrote:

I would be very grateful for suggestions on how best to tackle this project. I want to find out which SLC transporter transcripts are most highly expressed in the normal human hippocampus. For this, I plan to use publicly available Affymetrix U133 Plus 2.0 microarray data from ArrayExpress/GEO, taking the CEL files or normalized data for the normal/control tissues only. How can I combine data from different experiments/studies (all run on the U133 Plus 2.0 platform) to get the most reliable estimate and a ranked list of transcript abundance in the normal human hippocampus? Thank you!

meta microarray affymetrix • 3.1k views
modified 7.6 years ago by Qdjm (1.9k) • written 7.6 years ago by Nasir (50)
4
7.6 years ago by
Bergen, Norway
Michael Dondrup45k wrote:

The most important step in your analysis is probably to rerun the normalization and probe summarization from the CEL files. There are two reasons for this. First, it lets you use one consistent array-design description for all arrays during pre-processing. Second, many normalization methods (e.g. quantile normalization) scale each array in the context of all the chips in the experiment. If you take arrays out of that context and put them into a new one, the absolute values become meaningless. I would therefore recommend collecting all CEL files into a 'virtual experiment' and running normalization and summarization on them together, using the latest array description file (.adf).
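
To illustrate the second point, here is a minimal Python sketch of plain quantile normalization on made-up log2 intensities. It is not the RMA implementation and it ignores ties; the function name `quantile_normalize` and all numbers are assumptions for illustration only. It shows that the normalized values of the same array change depending on which other arrays it is normalized with.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (arrays) of a probes x arrays matrix.

    Toy version: ties are not handled specially.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)          # per-array sort order of the probes
    reference = np.sort(x, axis=0).mean(axis=1)  # mean of each quantile across arrays
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        # the k-th smallest probe of array j receives the k-th reference value
        out[order[:, j], j] = reference
    return out

# Made-up log2 intensities for 4 probes on 3 arrays.
a = [5.0, 7.0, 9.0, 11.0]   # the array we follow
b = [6.0, 8.0, 10.0, 12.0]  # companion array from the same study
c = [2.0, 3.0, 4.0, 5.0]    # companion array from a different study

array_a_with_b = quantile_normalize(np.column_stack([a, b]))[:, 0]
array_a_with_c = quantile_normalize(np.column_stack([a, c]))[:, 0]

print(array_a_with_b)  # array A normalized together with B
print(array_a_with_c)  # the same array A normalized together with C
```

The two printed vectors differ even though the raw data for array A are identical, which is why values normalized within one study cannot simply be pasted next to values normalized within another.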

written 7.6 years ago by Michael Dondrup (45k)
3
7.6 years ago by
Chris Evelo9.9k
Maastricht, The Netherlands
Chris Evelo9.9k wrote:

Please check the ArrayExpress Atlas: http://www.ebi.ac.uk/gxa/

It is a curated subset of ArrayExpress, containing studies that the curators consider useful for the kind of comparison you want to do.

If I remember correctly, it also provides data that have already been re-normalized with RMA, to make them as comparable as possible. But you will almost certainly need a statistical modelling approach that includes study as a factor.
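
One very simple reading of "study as a factor", sketched in Python on made-up numbers: for each gene, fit an additive model with a study effect constrained to sum to zero. Under that coding the least-squares estimate of the gene's overall level is the unweighted mean of the per-study means, so a study with many arrays does not dominate. The function name `study_adjusted_means` and the study labels are hypothetical, and a real analysis would more likely use a linear-model package.

```python
import numpy as np

def study_adjusted_means(expr, study):
    """Per-gene overall level and per-study offsets.

    expr  : genes x arrays matrix of log2 expression values
    study : one study label per array (column)
    Model per gene: y = overall level + study offset + noise,
    with the study offsets constrained to sum to zero.
    """
    expr = np.asarray(expr, dtype=float)
    study = np.asarray(study)
    labels = np.unique(study)
    # mean of each gene within each study (genes x studies)
    per_study = np.column_stack(
        [expr[:, study == s].mean(axis=1) for s in labels]
    )
    overall = per_study.mean(axis=1)        # each study weighted equally
    offsets = per_study - overall[:, None]  # study effects, sum to zero per gene
    return overall, offsets

# Made-up data: 2 genes, 3 arrays from study "s1" and 1 array from study "s2".
expr = [[8.0, 8.0, 8.0, 10.0],
        [4.0, 4.0, 4.0, 6.0]]
study = ["s1", "s1", "s1", "s2"]

overall, offsets = study_adjusted_means(expr, study)
pooled = np.asarray(expr).mean(axis=1)  # naive mean that ignores study

print(overall)  # study-adjusted level per gene
print(pooled)   # naive pooled mean, pulled towards the larger study
```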

written 7.6 years ago by Chris Evelo (9.9k)
1
7.6 years ago by
Qdjm1.9k
Toronto
Qdjm1.9k wrote:

If all you need is a rank ordering of SLC transporter transcripts, you could try sorting the expression levels in each array separately and then replacing each expression level with its sort order in the array. Then your "expression level" for each gene would be its median (or mean) rank (i.e. sort order) across all of the hippocampal arrays.

The advantage of this approach is that you need not worry about making all the measurements comparable. If you have enough samples, I bet you'll get virtually the same answer as a re-normalization approach.
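
A minimal Python sketch of this rank-based idea, on made-up values: the gene names are placeholders, `median_rank_order` is a hypothetical helper, and ties within an array are broken arbitrarily. In practice you would probably rank all probe sets on each array first and then pull out the SLC transporters.

```python
import numpy as np

def median_rank_order(expr, genes):
    """Order genes by their median within-array rank.

    expr  : genes x arrays matrix (any scale, arrays need not be comparable)
    genes : one name per row
    Rank 1 is the lowest value on an array; a higher rank means higher expression.
    """
    expr = np.asarray(expr, dtype=float)
    # double argsort turns values into 0-based ranks within each column
    ranks = np.argsort(np.argsort(expr, axis=0), axis=0) + 1
    median_rank = np.median(ranks, axis=1)
    order = np.argsort(-median_rank, kind="stable")  # highest median rank first
    return [(genes[i], float(median_rank[i])) for i in order]

# Made-up values: three arrays on very different scales.
genes = ["gene_A", "gene_B", "gene_C", "gene_D"]
expr = np.column_stack([
    [10.0, 8.0, 6.0, 4.0],       # array 1
    [200.0, 150.0, 160.0, 50.0], # array 2
    [3.0, 2.5, 2.0, 1.0],        # array 3
])

ranking = median_rank_order(expr, genes)
for name, rank in ranking:
    print(name, rank)
```

Because only the within-array order is used, the three arrays do not need to share a scale or a normalization.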

In addition to Michael's and Chris's suggestions, you might also need to run ComBat.R to adjust for batch effects when combining data from different labs. See the answers to this question on combining gene expression from multiple arrays.

written 7.6 years ago by Qdjm (1.9k)

Thank you all for your insightful comments. I just need a rank ordering of transcripts, so the simplicity of Quaid's approach is very appealing. Do you think anything more would be gained by putting the individual array data into RankProd?

written 7.5 years ago by Nasir (50)

Can't say, never used RankProd. But if it's a way to determine confidence intervals using ranks, it sounds like it would be a good thing to do.

written 7.5 years ago by Qdjm (1.9k)