Question: Comparing gene expression data from experiments performed on the same platform
5 months ago by
Natasha30 wrote:

Hi All, Recently, I was going through some of the posts on how to compare the gene expression data, obtained from microarray experiments, reported in various studies. In most of the answers found on this forum, it was suggested by many to use RankProd.

I have collected data from a set of 10 studies performed on the same platform. What is the recommended approach for comparing datasets that come from different studies but the same platform?

microarray gene expression • 229 views
5 months ago by
Kevin Blighe37k
Republic of Ireland
Kevin Blighe37k wrote:

You should just process them all together but include 'experiment' (or study) as a covariate when fitting your linear model with, presumably, limma.
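To make the "study as a covariate" idea concrete, here is a minimal sketch in plain Python rather than limma itself. It fits an ordinary least-squares model for a single gene with an intercept, a study indicator, and a condition indicator. The function names and toy values are made up for illustration, and limma additionally moderates the variances across genes, which this sketch does not attempt.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical toy data for one gene: (study, condition, log2 expression).
samples = [
    ("study1", "control", 5.0), ("study1", "control", 5.2),
    ("study1", "diabetic", 7.0), ("study1", "diabetic", 7.2),
    ("study2", "control", 8.0), ("study2", "control", 8.2),
    ("study2", "diabetic", 10.0), ("study2", "diabetic", 10.2),
]

# Design columns: intercept, study2 indicator, diabetic indicator.
design = [[1.0, float(s == "study2"), float(c == "diabetic")] for s, c, _ in samples]
y = [v for _, _, v in samples]

intercept, study_effect, condition_effect = fit_ols(design, y)
print(f"study effect: {study_effect:.2f}, condition effect: {condition_effect:.2f}")
```

In this toy data the second study sits about 3 units higher overall. The study term absorbs that offset, so the condition coefficient reflects the within-study difference of 2 rather than a mixture of the two effects.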

RankProd is designed for meta-analysis. Are you doing a meta-analysis, or do you just want a single combined experiment whose sample n is larger than that of any individual study?



Thanks a lot for the advice. I am slightly confused: my understanding is that combining datasets for further analysis is meta-analysis. I'm not sure about the distinction. Are these different?

To explain my end goal a bit more, consider the data from two studies:

  1. Study 1 (GSE20966): 20 samples; 10 labelled "Control" and 10 labelled "Type 2 Diabetic"
  2. Study 2 (GSE53454): 90 samples; 45 labelled "Control" and 45 labelled "IFN-beta and IFN-gamma"

The cell-type classification is as given in each GEO series. I want to combine the data from these two studies (in my real case I have 8 more studies) and find an average of the absolute gene expression values of 30 genes of interest, under two classifications: (i) control and (ii) diabetic.

Since the classification of cell type varies from study to study, I am not sure how this can be achieved. Could you please suggest the right way to proceed?

I had a chance to look at the documentation of the RankProd package a couple of days ago. From what I understand, the end result of an analysis performed using RankProd is a quantitative measure of the ranks of genes, obtained by comparing the fold-change ratios across samples and experiments.
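As a rough illustration of the rank-product idea described above, and not the RankProd package's actual code, the sketch below ranks genes by log fold change within each study and then combines the ranks with a geometric mean. The gene names and values are invented, and the real package also estimates significance by permutation, which is omitted here.

```python
import math


def rank_descending(values):
    """Map each gene to its rank (1 = largest value); ties are not handled specially."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def rank_product(log_fold_changes_by_study):
    """Geometric mean of each gene's within-study ranks across studies."""
    per_study_ranks = [rank_descending(lfc) for lfc in log_fold_changes_by_study]
    genes = per_study_ranks[0].keys()
    k = len(per_study_ranks)
    return {g: math.prod(r[g] for r in per_study_ranks) ** (1.0 / k) for g in genes}


# Hypothetical log2 fold changes (condition vs control) from three studies.
studies = [
    {"geneA": 2.1, "geneB": 0.3, "geneC": -1.0, "geneD": 1.2},
    {"geneA": 1.8, "geneB": 0.1, "geneC": -0.5, "geneD": 2.0},
    {"geneA": 2.5, "geneB": -0.2, "geneC": 0.0, "geneD": 1.0},
]

scores = rank_product(studies)
for gene in sorted(scores, key=scores.get):
    print(gene, round(scores[gene], 3))
```

A small score means the gene is consistently near the top of the up-regulated list in every study. Only ranks are combined, so the studies do not have to share a common expression scale.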

I'm still trying to get a handle on how the limma package works.

Many thanks for the tremendous support

modified 5 months ago • written 5 months ago by Natasha30

The distinction between a meta-analysis and simply merging all of your raw data together is as follows:

  1. a meta-analysis looks at the results of each independent study and determines which of them corroborate each other (or not). Each study is analysed independently, and only the results are compared.
  2. a simple merge of your data, on the other hand, means that you perform a single analysis and produce a single set of results.

Obviously, to do #2, the respective study designs should be the same or as similar as possible. If there are substantial differences, then a meta-analysis may be better. Even in this scenario, though, the end-points (classifications) should ideally be the same. It is possible to have multiple conditions, for example Controls, Condition1, Condition2, Condition3. This then becomes a multi-group comparison (a condition factor with more than two levels), which limma can handle.

So, I cannot really give a definitive answer, as I don't know the other studies in which you're interested. If all studies use exactly the same array platform but have multiple end-points, I would just normalise them all together and get the average of your genes across all controls and across each condition. With 10 studies, this is likely not as easy as I make out!
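The "average of your genes across all controls and each condition" step could look roughly like the following, assuming the data have already been normalised together and reduced to one value per sample per gene. The record layout, study labels, gene names and numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (study, condition, gene, normalised log2 expression).
records = [
    ("study1", "Control", "gene1", 12.0),
    ("study1", "Type 2 Diabetic", "gene1", 11.0),
    ("study2", "Control", "gene1", 12.4),
    ("study2", "IFN-beta and IFN-gamma", "gene1", 11.8),
    ("study1", "Control", "gene2", 9.0),
    ("study2", "Control", "gene2", 9.6),
]


def mean_by_gene_and_condition(rows):
    """Pool samples from all studies and average each gene within each condition."""
    groups = defaultdict(list)
    for _study, condition, gene, value in rows:
        groups[(gene, condition)].append(value)
    return {key: mean(vals) for key, vals in groups.items()}


averages = mean_by_gene_and_condition(records)
for (gene, condition), avg in sorted(averages.items()):
    print(f"{gene}\t{condition}\t{avg:.2f}")
```

Controls are pooled across studies here, while each study-specific condition keeps its own group. Whether pooling controls like this is reasonable depends on how comparable the studies really are.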

Note: end-points = conditions = classifications

modified 5 months ago • written 5 months ago by Kevin Blighe37k

Kevin, to start with, I'm considering 3 studies. The third study is GSE76896. The platform is the same for all of these studies.

To simply merge and normalize the data, I'm following the tutorial given in the limma package. Is there a recommended library for meta-analysis? I was looking for a tutorial with some example cases that would help me understand how to perform a meta-analysis. The package "GeneMeta" was suggested for meta-analysis in one of the posts. Are there other alternatives?

Many thanks

written 5 months ago by Natasha30