Question: GO Enrichment For Different P Values Problem
6.5 years ago by
United Kingdom
DoubleDecker 140 wrote:

I am somewhat surprised by my GO enrichment analysis results. Is it possible that using gene sets selected with a lower p-value threshold for differential gene expression gives me more GO terms than using a higher p-value threshold? I would expect that as I make the criterion for differential gene expression more stringent, the number of enriched/underrepresented terms would go down, and certainly no new terms would pop up.

enrichment go • 3.1k views
written 6.5 years ago by DoubleDecker 140
6.4 years ago by
miquelduranfrigola 760 wrote:

If I understand correctly, you obtain more enriched GO terms with a smaller (more conservative) set of differentially expressed genes. You were expecting to obtain fewer enriched GO terms from it, right?

In my opinion, what you are seeing is perfectly possible. GO has a DAG structure, and most GO enrichment analysis tools take this DAG into account. You might well be obtaining more specific terms now, for instance. Have a look at the depth of your enriched terms.
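To make the suggestion about depth concrete, here is a minimal Python sketch. The tiny hierarchy and the two result lists are made up for illustration; they are not the real ontology and not the output of any tool. Depth is assumed here to mean the longest path from a term up to the root.

```python
from functools import lru_cache

# Toy is_a hierarchy (hypothetical, NOT the real GO): child -> list of parents.
PARENTS = {
    "root": [],
    "process_A": ["root"],
    "process_B": ["root"],
    "process_A1": ["process_A"],
    "process_A1a": ["process_A1", "process_B"],  # two parents, so a DAG, not a tree
}


@lru_cache(maxsize=None)
def depth(term):
    """Length of the longest path from `term` up to the root."""
    parents = PARENTS[term]
    if not parents:
        return 0
    return 1 + max(depth(parent) for parent in parents)


def mean_depth(terms):
    """Average depth of a list of enriched terms."""
    return sum(depth(term) for term in terms) / len(terms)


# Hypothetical enrichment results at a loose and a strict expression cutoff.
loose_terms = ["process_A", "process_B"]
strict_terms = ["process_A", "process_A1", "process_A1a"]

print("mean depth, loose cutoff: ", mean_depth(loose_terms))
print("mean depth, strict cutoff:", mean_depth(strict_terms))
```

If the stricter list yields terms with a larger mean depth, the extra terms are probably more specific descendants of terms that were already enriched.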

If, in the end, what you want is to reduce the number of terms, why not simply use a more conservative threshold in the GO enrichment analysis itself?

modified 6.4 years ago • written 6.4 years ago by miquelduranfrigola 760

Yes, I have got it now... thanks Miguel and the rest of the contributors!

written 6.4 years ago by DoubleDecker 140
6.5 years ago by
Istvan Albert ♦♦ 78k
University Park, USA
Istvan Albert ♦♦ 78k wrote:

It is a bit unclear what you mean above. If you're asking whether a different gene set with a lower p-value could produce more GO terms, then that is just fine. If you mean that the same gene set produces more GO terms at a lower p-value cutoff, then most likely the cutoff parameter is doing something other than just applying the cutoff.

written 6.5 years ago by Istvan Albert ♦♦ 78k

Yes, I used the same gene set. I thought of a way this might happen. For example, at a .05 cutoff I might have 1,000 differentially expressed genes with a given term, 500 upregulated and 500 downregulated, plus 10,000 genes with this GO term that show no differential expression. At a .005 cutoff, however, only 100 genes are differentially expressed, 90 of them upregulated and 10 downregulated, and 10,900 genes now show no differential expression at this level. So I can imagine that you might get enrichment of this term at the .005 cutoff and no enrichment at the .05 cutoff, correct?

written 6.5 years ago by DoubleDecker 140

By p-value, are you actually talking about the significance of differential expression, not the significance of enrichment? And are you looking at enrichment in the list of genes that are NOT differentially expressed? If that's true, then yes, a more stringent p-value threshold on differential expression will increase the number of genes in your list of NOT differentially expressed genes, possibly resulting in more GO enrichments.

modified 6.5 years ago • written 6.5 years ago by Damian Kao 15k

Yes, I meant the p-value relating to the significance of differential expression, and I am looking for enrichment of DIFFERENTIALLY expressed genes.

modified 6.5 years ago • written 6.5 years ago by DoubleDecker 140

I guess it is possible to get more enriched GO terms as the input list gets smaller. It really all depends on how similar your genes are and what GO terms are associated with them. I can imagine a situation where a list of 1000 genes has extremely varied functions, resulting in no enrichment, versus a list of 100 genes all involved in the cell cycle, resulting in many enriched cell-cycle GO terms.
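As a rough illustration of this dilution effect, here is a minimal Python sketch of a one-sided hypergeometric enrichment test using only the standard library. All counts are hypothetical: a background of 20,000 genes, 500 of them annotated with one term, a loose list of 1,000 genes containing 30 annotated genes, and a strict list of 100 genes containing 20 of those 30. Real tools may use a different test or apply multiple-testing correction, so treat this as a sketch of the idea only.

```python
from math import comb


def enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the input list
    k: input genes annotated with the term
    """
    upper = min(n, K)
    # Every term shares the denominator comb(N, n), so the exact integer
    # numerators are summed first and divided once at the end.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Hypothetical counts, chosen only to illustrate the effect.
N, K = 20000, 500
p_loose = enrichment_p(N, K, n=1000, k=30)   # about 25 annotated genes expected by chance
p_strict = enrichment_p(N, K, n=100, k=20)   # about 2.5 annotated genes expected by chance

print(f"loose list  (1000 genes, 30 annotated): p = {p_loose:.3g}")
print(f"strict list ( 100 genes, 20 annotated): p = {p_strict:.3g}")
```

With these made-up counts, the loose list gives a p-value well above 0.05 while the strict list falls far below it, even though the strict list is a subset of the loose one.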

modified 6.5 years ago • written 6.5 years ago by Damian Kao 15k

Which I think is not just a comment but the actual answer to the question...

written 6.5 years ago by Chris Evelo 9.9k

OK, but that means precisely that it is not the same gene set: you are changing the group based on some parameter. In that case it is not surprising to get a different result.

written 6.5 years ago by Istvan Albert ♦♦ 78k