Question: Relevance of RNAseq meta-analysis
11 months ago by
guillaume.rbt790 wrote:

Hi all,

I'm starting a project in which I will gather several public human RNAseq datasets to perform a differential expression meta-analysis. The objective is to analyse multiple studies together to detect a signal that wouldn't be found in any individual study because of its low number of samples.

I will end up with a lot of samples (>500), and since I'm not a statistician I'm wondering what issues I might face with this high number.

Should I expect to gain power with a large meta-dataset? Or will mixing several studies introduce too many confounding effects?

Is there a threshold on the number of samples I should gather? Maybe adding more and more samples will just bring noise and make the analysis more difficult?

In your opinion, is a tool like DESeq2 suited to analyzing this kind of large dataset? Or should I use another type of approach to detect differentially expressed genes?

Thank you in advance for any of your input on this

modified 11 months ago by Asaf7.6k • written 11 months ago by guillaume.rbt790

Just keep in mind that whenever you mix different datasets together, you will have batch effects and other confounding effects, which are sometimes impossible to resolve. So at least try to retrieve datasets that are very similar to each other in terms of sequencing technology, instrument, read length, etc., and of course biological condition.
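One case where the study effect cannot be separated from the biology is when a study contributes samples from only one condition. A minimal sketch of a check for that, assuming the sample metadata is available as (study, condition) pairs; the function name, study labels and condition labels are all made up for illustration:

```python
from collections import defaultdict

def confounded_studies(samples):
    """Return the studies that contain only one condition.

    `samples` is an iterable of (study_id, condition) pairs.
    Both labels are hypothetical and only serve the example.
    """
    conditions_by_study = defaultdict(set)
    for study, condition in samples:
        conditions_by_study[study].add(condition)
    # A study with a single condition cannot separate batch from biology
    return sorted(s for s, conds in conditions_by_study.items() if len(conds) < 2)

# Toy metadata, invented for this example
samples = [
    ("study_A", "control"), ("study_A", "disease"),
    ("study_B", "control"), ("study_B", "disease"),
    ("study_C", "disease"), ("study_C", "disease"),
]
print(confounded_studies(samples))  # ['study_C']
```

A study flagged this way can still be used, but its samples carry no within-study contrast, so any difference they show is indistinguishable from a batch effect.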

written 11 months ago by grant.hovhannisyan1.9k

OK, thanks, I will keep those constraints in mind.

written 11 months ago by guillaume.rbt790
11 months ago by
Asaf7.6k wrote:

A few thoughts:

  1. Normalization is going to be tricky. I would normalize each study by itself.
  2. I think limma would be more straightforward here: once you have all the normalized studies, you can easily model study as a confounder.
  3. I don't think you can have too many samples; just be careful to look at the effect size along with the p-value, since with many samples even a tiny effect can reach significance.
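
To illustrate points 2 and 3, here is a rough single-gene sketch in plain Python. It is not limma: it only shows the idea of adjusting for study and reporting the effect next to the p-value. It assumes the expression values are already normalized and on a log scale, and that condition is coded 0/1. The function name and the toy numbers are invented, and the p-value uses a normal approximation that is only reasonable with many samples.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean

def study_adjusted_effect(values, conditions, studies):
    """Estimate the condition effect for one gene, adjusting for study.

    values     : normalized log-expression per sample (hypothetical input)
    conditions : 0 for control, 1 for case
    studies    : study label per sample
    Centering both variables within each study gives the same slope as a
    linear model with one intercept per study.
    """
    by_study = defaultdict(list)
    for i, s in enumerate(studies):
        by_study[s].append(i)

    x_c = [0.0] * len(values)
    y_c = [0.0] * len(values)
    for idx in by_study.values():
        mx = mean(conditions[i] for i in idx)
        my = mean(values[i] for i in idx)
        for i in idx:
            x_c[i] = conditions[i] - mx
            y_c[i] = values[i] - my

    sxx = sum(x * x for x in x_c)
    # Residual degrees of freedom: samples - study intercepts - one slope
    dof = len(values) - len(by_study) - 1
    if sxx == 0 or dof <= 0:
        raise ValueError("condition effect cannot be estimated from this design")

    effect = sum(x * y for x, y in zip(x_c, y_c)) / sxx
    rss = sum((y - effect * x) ** 2 for x, y in zip(x_c, y_c))
    se = math.sqrt(rss / dof / sxx)
    if se == 0:
        return effect, 0.0
    # Normal approximation to the t distribution (assumes a large dof)
    p = 2 * (1 - NormalDist().cdf(abs(effect / se)))
    return effect, p

# Toy data: two studies with different baselines and the same shift of 1.0
values = [5.0, 5.2, 4.8, 6.0, 6.2, 5.8, 9.0, 9.2, 8.8, 10.0, 10.2, 9.8]
conditions = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
studies = ["A"] * 6 + ["B"] * 6

effect, p = study_adjusted_effect(values, conditions, studies)
print(round(effect, 3), p)
```

The returned effect is a log-scale difference between conditions, so it can be filtered on directly (for example, requiring a minimum absolute change) instead of relying on the p-value alone.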

Good luck

written 11 months ago by Asaf7.6k

Thank you for your help, I will compare DESeq2 and limma to see how the results differ.

written 11 months ago by guillaume.rbt790