Question: Why does GO enrichment give different results when the gene list cutoff changes?
3.0 years ago by
hellocita20 wrote:

I am new to GO annotation. I use DAVID for GO annotation, which tests for gene overrepresentation with Fisher's exact test. My gene list comes with an FDR cutoff. As I see it, if I choose the genes with FDR <= 10% for GO annotation, the matched GO terms should overlap to some extent with those from the FDR <= 5% genes, because the two lists share many genes and the latter list has higher confidence. However, the results are totally different, which makes me doubt whether the GO annotation result for the FDR <= 10% genes is real.

How can the annotation be so sensitive to changes in the input gene set? And are there any methods, papers, or packages that address this? Thanks!

rna-seq gene • 1.3k views
modified 3.0 years ago by theobroma221.1k • written 3.0 years ago by hellocita20

There should ideally be a good overlap between the two, but it is definitely not guaranteed. For example, how many genes have an FDR below 0.05, and how many below 0.1? The latter set may be much larger, in which case the shared genes make up only a small part of it.

A common gene set enrichment tool that doesn't depend on a threshold is GSEA, but there are many other algorithms available. You can pick the one that best suits your needs.
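To make the "no threshold" idea concrete, here is a minimal sketch of an unweighted running-sum (Kolmogorov-Smirnov-style) enrichment score computed over the whole ranked gene list. It is only in the spirit of GSEA and is not the GSEA implementation: the real tool weights each step by the ranking metric and assesses significance by permutation. The function name and the toy gene IDs are made up for illustration, and the ranked list is assumed to contain no duplicates.

```python
def running_sum_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list.

    Walk down the list, stepping up at genes in the set and down at
    genes outside it; return the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hits if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list (most significant first) and two toy gene sets.
ranked = ["g%d" % i for i in range(10)]
score_top = running_sum_score(ranked, {"g0", "g1", "g2"})
score_bottom = running_sum_score(ranked, {"g7", "g8", "g9"})
print(score_top, score_bottom)
```

Because every gene contributes through its rank, moving an FDR cutoff from 5% to 10% does not change which genes enter the calculation.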

modified 3.0 years ago • written 3.0 years ago by Martombo2.7k

Thank you @Martombo, the change in gene number should be the reason. At FDR 10%, I have 390 genes in the list and 20 enriched GO terms (BH-corrected Fisher test p-value < 0.05). At FDR 5%, however, I have only 76 genes in the list and no GO terms are called significant. Even if I relax the cutoff (Fisher test p-value < 0.1) to get some enriched GO terms, they still do not overlap with the first list and look totally different. I should figure out other ways to interpret the gene list. And thanks for your suggestion!
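For readers unfamiliar with the BH correction mentioned here, the following is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. The raw p-values are invented for illustration and are not taken from the DAVID output; the function name is also just a placeholder. It shows that each adjusted value depends on the rank of the p-value and on how many terms were tested, so a modest shift in the raw p-values can move many terms across the 0.05 line at once.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw Fisher p-values for five GO terms.
raw = [0.001, 0.008, 0.02, 0.3, 0.6]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(p, q, "significant" if q < 0.05 else "not significant")
```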

modified 3.0 years ago • written 3.0 years ago by hellocita20
3.0 years ago by
theobroma221.1k wrote:

They are both true. You have to consider the math behind the enrichment that gives you the result. If you change the number of genes, you change the result, because the number of genes for GO category X is a factor determining the significance of those genes and that category. Ninety-nine percent of the time, if you change the input, the output changes.
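As a rough illustration of that math, here is a sketch of the one-sided Fisher exact test written as a hypergeometric upper tail, using only the Python standard library. The list sizes 390 and 76 come from this thread; the background size (20000), the GO term size (200), and the hit counts (16 and 3) are assumptions chosen so that both lists show roughly the same fourfold enrichment. DAVID reports a modified, more conservative version of this test, so its numbers would differ.

```python
from math import comb


def enrichment_pvalue(hits, list_size, term_size, background_size):
    """P(X >= hits) under the hypergeometric null (one-sided Fisher exact test)."""
    total = comb(background_size, list_size)
    max_hits = min(list_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Assumed background of 20000 genes and a GO term annotating 200 of them.
# Both lists have about 4% of their genes in the term (about 1% expected).
p_large = enrichment_pvalue(16, 390, 200, 20000)  # FDR 10% list, 390 genes
p_small = enrichment_pvalue(3, 76, 200, 20000)    # FDR 5% list, 76 genes
print(p_large, p_small)
```

With the same degree of enrichment, the shorter list gives a much larger p-value simply because it has fewer genes, which is one way the 76-gene list can lose terms that the 390-gene list calls significant.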

written 3.0 years ago by theobroma221.1k

Thank you @theobroma22, but how can I trust the enrichment results if they change whenever the FDR cutoff of the gene list changes?

written 3.0 years ago by hellocita20