Would using only a subset of the discovered DEGs in downstream analysis decrease the power of the results?
3.6 years ago
citronxu ▴ 20

Hello everyone,

I'm doing RNA-seq analysis on plants, mostly Arabidopsis, and I have a question: if only part of the DEGs (e.g. only the up-regulated genes found between control and treatment) is used for downstream analysis such as GO or KEGG pathway enrichment, would that decrease the power of the final enrichment results? I discussed this with a colleague a few days ago, and he thought that if not all DEGs go into the analysis, the result will have poor statistical power, because the background (still the whole database) does not change accordingly. I've been thinking about this issue but cannot figure out whether it would have an impact or not. Any advice and comments are welcome. Many thanks in advance.

rna-seq kegg pathway analysis GO analysis • 776 views
3.6 years ago

You're really just changing the question that you're asking from "What pathways/GO terms are enriched in my differentially expressed genes?" to "Which pathways/GO terms are enriched in my upregulated genes?"

There's nothing inherently wrong with changing the question that way, and in fact, it may reveal enriched terms/pathways that may be washed out with larger gene sets. And you can, and I'll argue that you should, change the background gene set for your enrichment analyses in many cases. I have some additional comments on that (and enrichment analyses in general) in this answer.


Hello, many thanks for the reply. I totally agree on how important the background gene set is; using the wrong set would probably produce an 'expected' but meaningless result in the end. But in my case, I isolated RNA from the general cells of leaves and roots without focusing on specific cell types, so I suppose I won't encounter the issues above. And even after the 'dilution' that comes with larger gene sets, the ranking of terms will not really change, right? So I think the analysis with part of the DEGs could be conducted just as it would be with all DEGs?


You're changing the lists - rankings could and would certainly change. If list A contains all of your 100 DEGs and 20 of those are associated with a given term, and list B contains just the 40 upregulated DEGs, and still contains 20 genes associated with the aforementioned term, the ranking and significance values for that term are going to change between lists.
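To make the list A vs. list B example concrete, here is a minimal sketch of a one-sided hypergeometric (over-representation) test using only the Python standard library. The background size (20,000 genes) and the number of genes annotated to the term (500) are made-up values for illustration; they are not from this thread, and real enrichment tools may use a different test or correction.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k term genes when drawing
    n genes from a background of N genes, K of which carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical values, assumed only for this illustration
BACKGROUND = 20000  # genes in the background
TERM_SIZE = 500     # background genes annotated to the term

# List A: all 100 DEGs, 20 of them annotated to the term
p_all = hypergeom_sf(20, BACKGROUND, TERM_SIZE, 100)
# List B: the 40 upregulated DEGs, still 20 annotated to the term
p_up = hypergeom_sf(20, BACKGROUND, TERM_SIZE, 40)

print(f"all DEGs:         p = {p_all:.3g}")
print(f"upregulated only: p = {p_up:.3g}")
```

The same 20 hits make up half of list B but only a fifth of list A, so the term gets a smaller p-value in the upregulated-only list. With many terms behaving differently like this, the overall ranking can shift too.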

And you could still limit your background genes to only those that are expressed in either of your tissues of interest. I imagine the backgrounds for flowers and stems are somewhat different from those for roots and leaves.
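As a rough illustration of why the background matters, the sketch below runs the same one-sided hypergeometric test on the same gene list against two backgrounds: a whole-genome background and a smaller "expressed genes only" background. All counts (20,000 vs. 12,000 background genes, 500 vs. 450 term genes) are hypothetical numbers chosen for the example, not values from any real annotation.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for n genes drawn from N background genes, K carrying the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

LIST_SIZE = 40  # e.g. the upregulated DEGs
TERM_HITS = 20  # of which this many carry the term

# Hypothetical whole-genome background
p_genome = hypergeom_sf(TERM_HITS, 20000, 500, LIST_SIZE)
# Hypothetical background restricted to genes expressed in the sampled tissues
p_expressed = hypergeom_sf(TERM_HITS, 12000, 450, LIST_SIZE)

print(f"whole-genome background:   p = {p_genome:.3g}")
print(f"expressed-only background: p = {p_expressed:.3g}")
```

In this made-up case the term is relatively more common among expressed genes, so the restricted background gives a less extreme p-value. The direction and size of the change depend entirely on the real counts.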


I see. I'll check the literature to figure out the general differences in gene sets between the organs in my experiment, then.


Or should the thresholds that define terms as 'significant' be adjusted accordingly?
