Bias in GO enrichment analysis for non-model organisms?
4.8 years ago
yh362 ▴ 50

I am currently working on a non-model plant species, and after running InterProScan I realized that only a little over half of the genes (~38,000 out of 63,000) get at least one GO term assigned. So if I were to do a GO enrichment analysis, some genes of interest (say, differentially expressed genes) may not have a GO term associated with them and, I suppose, that information would be lost in the enrichment analysis. So is GO enrichment analysis inherently biased/unreliable for non-model organisms? If someone can point to some papers that discuss this, that would be very helpful. If I am wrong, please correct me, since I am new to this kind of analysis. Thanks in advance!

gene ontology GO enrichment topGO interproscan • 1.3k views
4.8 years ago

Yes, this is an important point that is often ignored in plant genomics papers! Here's a paper discussing GO database bias, with a bit on the bias introduced by unannotated genes, though it is all in human: https://www.nature.com/articles/s41598-018-23395-2

The other problem is that most of the genes your GO terms are lifted over from are Arabidopsis genes, and that will introduce further bias: there are quite a few Arabidopsis genes which have different functions in close relatives (example: https://www.nature.com/articles/hortres201454 ).

As far as I know, there is no package which takes this uncertainty into account, so yeah, if you redo your GO enrichment analysis from scratch in 10 years, chances are you'll get different results.
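To make the effect of unannotated genes concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test run against two backgrounds: all genes versus only the genes that received a GO term. The gene totals come from the question; the DE gene counts, the term size, and the function name `enrichment_pvalue` are hypothetical and only there for illustration. This is not what topGO or any other package does internally.

```python
from math import comb

def enrichment_pvalue(k, N, K, n):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background
    K: background genes carrying the GO term
    n: genes of interest that are in the background
    k: genes of interest carrying the GO term
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

# From the question: 63000 genes, about 38000 with at least one GO term.
ALL_GENES, ANNOTATED = 63000, 38000

# Hypothetical numbers, for illustration only.
DE_ALL, DE_ANNOTATED = 1000, 800      # DE genes in total / with a GO term
TERM_SIZE, DE_WITH_TERM = 300, 15     # genes carrying the term / DE genes carrying it

# Background = all genes: unannotated genes are silently counted as "lacking the term".
p_all = enrichment_pvalue(DE_WITH_TERM, ALL_GENES, TERM_SIZE, DE_ALL)

# Background = annotated genes only: unannotated genes are left out on both sides.
p_annotated = enrichment_pvalue(DE_WITH_TERM, ANNOTATED, TERM_SIZE, DE_ANNOTATED)

print(f"all genes as background:       p = {p_all:.3g}")
print(f"annotated genes as background: p = {p_annotated:.3g}")
```

With these made-up counts the all-genes background gives the smaller p-value, because the 25,000 unannotated genes are treated as if they were known not to carry the term. Restricting the background to annotated genes avoids that particular artifact, but it does not recover whatever the unannotated genes actually do, so the bias described above remains.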
