Question

Comparing gene expression data from experiments performed on the same platform

0

5.6 years ago

Natasha ▴ 40

Hi all. Recently, I was going through some of the posts on how to compare gene expression data obtained from microarray experiments reported in various studies. Many of the answers on this forum suggest using RankProd.

I have collected data from a set of 10 studies performed on the same platform. What is the recommended approach for comparing datasets from different studies that used the same platform?

microarray gene expression • 1.8k views

score 1 · Answer 1 · 2018-09-03

1

5.6 years ago

Kevin Blighe 87k

You should just process them all together but include 'experiment' (or study) as a covariate when fitting your linear model with, presumably, limma.
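As a rough illustration of what including 'study' as a covariate does (a minimal Python sketch for a single gene, not limma itself): in an additive model expression ~ condition + study, the condition coefficient equals what you get by centring both the expression values and the condition indicator within each study before regressing. The function names and toy values below are made up for illustration; a real analysis would fit this per gene in limma, which also provides moderated statistics.

```python
from collections import defaultdict


def naive_difference(samples):
    """Mean(case) - mean(control), ignoring which study a sample came from."""
    cases = [value for _, is_case, value in samples if is_case]
    controls = [value for _, is_case, value in samples if not is_case]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def condition_effect_adjusted_for_study(samples):
    """Condition coefficient of: expression ~ condition + study (one gene).

    samples: list of (study, is_case, expression) tuples.
    Centring within each study removes the per-study offset, which is
    equivalent to fitting one intercept per study.
    """
    by_study = defaultdict(list)
    for study, is_case, value in samples:
        by_study[study].append((1.0 if is_case else 0.0, value))

    numerator = 0.0
    denominator = 0.0
    for rows in by_study.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            numerator += (x - mean_x) * (y - mean_y)
            denominator += (x - mean_x) ** 2

    if denominator == 0.0:
        raise ValueError("no study contains both conditions")
    return numerator / denominator


# Toy data (hypothetical): study B sits 3 units higher than study A,
# and the two studies have unbalanced case/control ratios.
toy_samples = [
    ("A", False, 5.0), ("A", False, 5.0), ("A", False, 5.0), ("A", True, 6.0),
    ("B", False, 8.0), ("B", True, 9.0), ("B", True, 9.0), ("B", True, 9.0),
]

print("naive difference:", naive_difference(toy_samples))
print("study-adjusted effect:", condition_effect_adjusted_for_study(toy_samples))
```

In this toy example the true within-study difference is 1.0, but the naive pooled difference is inflated because most cases come from the study with the higher baseline. Adjusting for study recovers the within-study effect.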

RankProd is designed for meta-analysis. Are you doing a meta-analysis, or do you just want a single experiment whose combined sample size (n) is larger than that of any individual study?

Kevin

0

Thanks a lot for the advice. I am slightly confused: my understanding is that combining datasets for further analysis is meta-analysis. I'm not sure about the distinction. Are these different?

To explain my end goal a bit more, consider the data from two studies:

Study 1: 20 samples (cell type classification given in GSE20966: 10 samples labelled "Control", 10 samples labelled "Type 2 Diabetic")
Study 2: 90 samples (cell type classification given in GSE53454: 45 samples labelled "Control", 45 samples labelled "IFN-beta and IFN-gamma")

I want to combine the data from these two studies (in my real case, I have 8 more studies) and find the average of the absolute value of the gene expression of 30 genes of interest, under two classifications: (i) control and (ii) diabetic.

Since the classification of cell type varies from study to study, I am not sure how this can be achieved. Could you please suggest the right way to proceed?

A couple of days ago, I had a chance to look at the documentation of the RankProd package. From what I understand, the end result of a RankProd analysis is a quantitative measure of gene ranks, obtained by comparing fold-change ratios across samples and experiments.
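To make the rank-based idea concrete, here is a minimal Python sketch of the rank product statistic itself: genes are ranked by fold change within each study, and each gene's score is the geometric mean of its ranks across studies. This is only an illustration under simplifying assumptions (no tied fold changes, every gene present in every study, made-up gene names and values). It is not the RankProd package, which additionally assesses significance by permutation.

```python
import math


def rank_product(log_fold_changes_by_study):
    """Geometric mean of per-study ranks for each gene.

    log_fold_changes_by_study: list of dicts mapping gene -> log fold change,
    one dict per study. Rank 1 is the most up-regulated gene in that study,
    so a small rank product means consistently strong up-regulation.
    """
    genes = list(log_fold_changes_by_study[0])
    log_rank_sums = {gene: 0.0 for gene in genes}

    for fold_changes in log_fold_changes_by_study:
        ordered = sorted(genes, key=lambda g: fold_changes[g], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            # Summing logs avoids overflow when many studies are multiplied.
            log_rank_sums[gene] += math.log(rank)

    n_studies = len(log_fold_changes_by_study)
    return {gene: math.exp(total / n_studies)
            for gene, total in log_rank_sums.items()}


# Hypothetical log fold changes for three genes in two studies.
study_1 = {"geneA": 2.0, "geneB": 0.5, "geneC": -1.0}
study_2 = {"geneA": 1.5, "geneB": -0.2, "geneC": 0.3}

for gene, score in sorted(rank_product([study_1, study_2]).items(),
                          key=lambda item: item[1]):
    print(gene, round(score, 3))
```

Because only ranks enter the statistic, the studies do not need to share a common expression scale, which is why the approach is popular for combining independently processed datasets.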

I'm still trying to get a handle on how the limma package works.

Many thanks for the tremendous support.

5.6 years ago by Natasha ▴ 40

1

The distinction between a meta-analysis and simply merging all of your raw data is as follows:

1. A meta-analysis looks at the results of each independent study and determines which of them corroborate each other (or not). Each study is analysed independently, and only the results are compared.
2. A simple merge of your data, on the other hand, means that you perform a single analysis and produce a single set of results.
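One common way to compare results across independently analysed studies is fixed-effect inverse-variance pooling of per-study effect sizes. The sketch below is a generic illustration of that technique in Python, not something prescribed in this thread, and the function name and numbers are hypothetical. It assumes each study has already produced an effect estimate (for example, a log fold change for one gene) and its standard error.

```python
import math


def fixed_effect_meta(effects, standard_errors):
    """Inverse-variance weighted pooled effect and its standard error.

    effects: per-study effect estimates for one gene.
    standard_errors: matching per-study standard errors (all > 0).
    More precise studies (smaller standard error) get larger weights.
    """
    if len(effects) != len(standard_errors):
        raise ValueError("effects and standard_errors must have equal length")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-study results for a single gene.
pooled_effect, pooled_se = fixed_effect_meta([1.0, 3.0], [0.5, 1.0])
print("pooled effect:", pooled_effect)
print("pooled standard error:", pooled_se)
```

The contrast with a merge is that the raw expression values never meet: each study contributes only its summary result. A fixed-effect model also assumes the studies estimate the same underlying effect; if that is doubtful, a random-effects model is the usual alternative.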

Obviously, to do #2, the respective study designs should be the same or as similar as possible. If differences lurk, then a meta-analysis may be better. Even in this scenario, though, the end-points (classifications) should ideally be the same. It is possible to have multiple conditions, though (for example, Controls, Condition1, Condition2, Condition3). This then becomes a multi-group comparison, which limma can handle.

So, I cannot really give a definitive answer, as I don't know the other studies in which you're interested. If all studies are on the same platform but have multiple end-points, I would just normalise them all together and take the average of your genes across all controls and across each condition. This assumes the array platforms are exactly the same. With 10 studies, though, this is likely not as easy as I make out!
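The averaging step can be sketched in a few lines of Python. This assumes the samples have already been normalised together and that each sample carries a condition label and a gene-to-expression mapping. The data layout, function name, gene names, and values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_expression_by_condition(samples, genes_of_interest):
    """Average expression per gene within each condition.

    samples: list of (condition, {gene: normalised expression}) tuples,
    pooled across studies after joint normalisation.
    Returns {gene: {condition: mean expression}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for condition, expression in samples:
        for gene in genes_of_interest:
            if gene in expression:  # skip genes missing from a sample
                grouped[gene][condition].append(expression[gene])
    return {gene: {condition: mean(values)
                   for condition, values in by_condition.items()}
            for gene, by_condition in grouped.items()}


# Hypothetical normalised values for two genes across three samples.
pooled_samples = [
    ("Control", {"gene1": 10.0, "gene2": 4.0}),
    ("Control", {"gene1": 12.0, "gene2": 6.0}),
    ("Type 2 Diabetic", {"gene1": 8.0, "gene2": 7.0}),
]

print(mean_expression_by_condition(pooled_samples, ["gene1", "gene2"]))
```

Note that a plain average like this does not account for study-to-study offsets, so it is only meaningful once the datasets are genuinely comparable.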

Note: end-points = conditions = classifications

5.6 years ago by Kevin Blighe 87k

0

Kevin, to start with, I'm considering 3 studies. The third study is GSE76896. The platform is the same for all of these studies.

To simply merge the data and normalize, I'm following the tutorial given in the limma package. Is there a recommended library for meta-analysis? I was looking for a tutorial with example cases that would help me understand how to perform a meta-analysis. The package "GeneMeta" was suggested for meta-analysis in one of the posts. Are there other alternatives?

Many thanks

5.6 years ago by Natasha ▴ 40

0

I know it's an old post, but I guess if you have the same platform for all studies, as Natasha has mentioned, then it's not a meta-analysis?

But combining many such experiments from different platforms calls for a meta-analysis?

Please find my related question below. I posted it as a separate question, but unfortunately there are no answers yet. (Combining different conditions of study from different experiments (same platform) for microarray meta-analysis)

So, for a single DE analysis, would it be valid to combine control and test conditions from different experiments but same platform and then normalize them? Do we need to implement ComBat or sva in such cases?

And with regard to meta-analysis: can I use several such comparisons from different platforms (but with the same control and test conditions), find normalized values for each platform separately, and proceed with an effect size analysis (after computing the ratio of normalized values between the two conditions for each platform)?

5.1 years ago by devikaparvathy ▴ 50

1

Hey, I have now answered your other question. I originally saw it but wanted to see if anybody else was going to respond.

Regarding meta-analysis, I believe it is platform independent, so the studies can be on the same platform or on different platforms. The studies should be matched on key demographic and clinical features, though.

5.1 years ago by Kevin Blighe 87k