Question

Relevance of RNAseq meta-analysis

0

4.9 years ago

guillaume.rbt ★ 1.0k

Hi all,

I'm starting a project in which I will gather several public human RNA-seq datasets to perform a differential expression meta-analysis. The objective is to analyse multiple studies together to detect a signal that would not be found in any individual study because of its low number of samples.

I will end up with a lot of samples (>500), and since I'm not a statistician I'm wondering what issues I might face with this high number.

Should I expect to gain power with a large meta-dataset? Or will mixing several studies introduce too many confounding effects?

Is there a threshold on the number of samples I should gather? Maybe adding more and more samples will just bring noise and make the analysis more difficult?

In your opinion, is a tool like DESeq2 suited to analysing this kind of large dataset? Or should I use another type of approach to detect differentially expressed genes?

Thank you in advance for any input on this.

RNA-Seq differential expression • 1.3k views

ADD COMMENT • link updated 4.9 years ago by Asaf 10k • written 4.9 years ago by guillaume.rbt ★ 1.0k

1

Just keep in mind that whenever you mix different datasets together, you will have batch effects and other confounding effects, which are sometimes impossible to resolve. So at least try to retrieve datasets that are very similar to each other from the perspective of sequencing technology, instrument, read length, etc., and of course biological condition.
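One situation where the study effect cannot be separated from the biology is when a study contributes samples from only one condition. As a rough illustration, the sketch below flags such studies from a list of sample annotations. The function name `confounded_studies` and the study and condition labels are made up for the example; real metadata would come from the repositories' sample sheets.

```python
from collections import defaultdict


def confounded_studies(samples):
    """Return the studies that lack at least one of the observed conditions.

    samples: iterable of (study, condition) pairs, one pair per sample.
    """
    # Collect the set of conditions seen in each study.
    seen = defaultdict(set)
    for study, condition in samples:
        seen[study].add(condition)

    # All conditions observed anywhere in the collection.
    all_conditions = set().union(*seen.values()) if seen else set()

    # A study missing a condition cannot contribute a within-study contrast.
    return sorted(s for s, conds in seen.items() if conds != all_conditions)


# Hypothetical annotations: study_B only contains cases.
annotations = [
    ("study_A", "case"), ("study_A", "control"),
    ("study_B", "case"), ("study_B", "case"),
    ("study_C", "control"), ("study_C", "case"),
]
print(confounded_studies(annotations))
```

Studies flagged this way are candidates for exclusion, or at least for closer inspection, before any joint analysis.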

ADD REPLY • link 4.9 years ago by grant.hovhannisyan ★ 2.6k

0

OK, thanks, I will keep those constraints in mind.

ADD REPLY • link 4.9 years ago by guillaume.rbt ★ 1.0k

score 2 · Answer 1 · 2019-06-20

2

4.9 years ago

Asaf 10k

A few thoughts:

Normalization is going to be tricky. I would normalize each study separately.
I think limma would be more straightforward here: once you have all the normalized studies, you can easily include study as a covariate (confounder) in the model.
I don't think you can have too many samples; just be careful to look at the effect size along with the p-value.
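
To make the last two points concrete, here is a minimal Python sketch (not limma itself) of a per-gene test that adjusts for study and reports the effect size next to the p-value. It assumes the expression values are already normalized within each study and are on a log2 scale. The function name `study_adjusted_effect` and the toy numbers are hypothetical, and the p-value uses a normal approximation that is only reasonable with many samples.

```python
import math
from collections import defaultdict
from statistics import NormalDist


def study_adjusted_effect(expr, condition, study):
    """Estimate a condition effect for one gene, adjusting for study.

    expr      : log2 expression per sample, normalized within each study
    condition : 0/1 indicator per sample (0 = control, 1 = case)
    study     : study label per sample
    Returns (log2 fold change, standard error, approximate two-sided p-value).
    """
    # Group sample indices by study.
    by_study = defaultdict(list)
    for i, s in enumerate(study):
        by_study[s].append(i)

    # Centre expression and condition within each study. Regressing the
    # centred values is equivalent to fitting one intercept per study.
    x_c = [0.0] * len(expr)
    y_c = [0.0] * len(expr)
    for idx in by_study.values():
        mean_x = sum(condition[i] for i in idx) / len(idx)
        mean_y = sum(expr[i] for i in idx) / len(idx)
        for i in idx:
            x_c[i] = condition[i] - mean_x
            y_c[i] = expr[i] - mean_y

    sxx = sum(x * x for x in x_c)
    if sxx == 0:
        raise ValueError("condition does not vary within any study")
    beta = sum(x * y for x, y in zip(x_c, y_c)) / sxx

    # Degrees of freedom: samples minus study intercepts minus one slope.
    df = len(expr) - len(by_study) - 1
    if df <= 0:
        raise ValueError("not enough samples to estimate the variance")
    rss = sum((y - beta * x) ** 2 for x, y in zip(x_c, y_c))
    se = math.sqrt(rss / df / sxx)
    if se == 0:
        raise ValueError("no residual variation; cannot compute a p-value")

    # Normal approximation to the t distribution (assumes large df).
    p = 2 * (1 - NormalDist().cdf(abs(beta / se)))
    return beta, se, p


# Toy data: two studies with very different baselines, same case/control shift.
expr = [5.0, 5.2, 4.8, 6.0, 6.1, 5.9, 9.0, 9.1, 8.9, 10.0, 10.2, 9.8]
condition = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
study = ["A"] * 6 + ["B"] * 6
beta, se, p = study_adjusted_effect(expr, condition, study)
print(f"log2FC={beta:.2f}  SE={se:.3f}  p={p:.3g}")
```

With hundreds of samples, very small shifts can reach significance, so it helps to filter on the estimated log2 fold change as well as on the p-value.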

Good luck

ADD COMMENT • link 4.9 years ago by Asaf 10k

0

Thank you for your help. I will compare DESeq2 and limma to see how the results differ.

ADD REPLY • link 4.9 years ago by guillaume.rbt ★ 1.0k