Question: Bias in GO enrichment analysis for non-model organisms?
yh362 wrote, 5 weeks ago:

I am currently working on a non-model plant species, and after running InterProScan I realized that only a little over half of the genes (~38,000 out of 63,000) get at least one GO term assigned. So if I were to do a GO enrichment analysis, some genes of interest (say, differentially expressed genes) may not have any GO term associated with them and, I suppose, that information would be lost in the enrichment analysis. So is GO enrichment analysis inherently biased/unreliable for non-model organisms? If someone can point me to some papers that discuss this, that would be very helpful. If I am wrong, please correct me, since I am new to this kind of analysis. Thanks in advance!

Philipp Bayer (Australia/Perth/UWA) wrote, 5 weeks ago:

Yes, this is an important point that is often ignored in plant genomics papers! Here's a paper discussing bias in the GO database, with a bit on the bias introduced by unannotated genes, though it is all based on human data: https://www.nature.com/articles/s41598-018-23395-2

The other problem is that most of the genes your GO terms are lifted over from are Arabidopsis genes, and that will introduce further bias: quite a few Arabidopsis genes have different functions in close relatives (example: https://www.nature.com/articles/hortres201454 )

As far as I know, there is no package that takes this uncertainty into account, so yes, if you redo your GO enrichment analysis from scratch in 10 years, chances are you'll get different results.
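
To make the effect of unannotated genes concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test run against two different backgrounds. Only the totals (63,000 genes, 38,000 with at least one GO term) come from the question; the size of the differentially expressed set, the number of annotated genes in it, the term size, and the function names are hypothetical values chosen for illustration. It is not how any particular GO package implements the test.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def enrichment_p(background, term_genes, study, study_with_term):
    """One-sided hypergeometric p-value: P(X >= study_with_term).

    background      -- number of genes in the background set
    term_genes      -- background genes annotated with the GO term
    study           -- number of genes in the study set (e.g. DE genes)
    study_with_term -- study genes annotated with the GO term
    """
    log_denom = log_comb(background, study)
    upper = min(term_genes, study)
    return sum(
        exp(
            log_comb(term_genes, k)
            + log_comb(background - term_genes, study - k)
            - log_denom
        )
        for k in range(study_with_term, upper + 1)
    )


# Totals from the question.
ALL_GENES = 63000
ANNOTATED_GENES = 38000

# Hypothetical numbers, for illustration only.
DE_GENES = 1000            # differentially expressed genes
DE_ANNOTATED = 800         # DE genes with at least one GO term
TERM_SIZE = 400            # genes carrying the GO term of interest
DE_WITH_TERM = 15          # DE genes carrying that term

# Background = every gene, annotated or not.
p_all = enrichment_p(ALL_GENES, TERM_SIZE, DE_GENES, DE_WITH_TERM)

# Background = only genes that have at least one GO term.
p_annotated = enrichment_p(ANNOTATED_GENES, TERM_SIZE, DE_ANNOTATED, DE_WITH_TERM)

print(f"all genes as background:       p = {p_all:.4g}")
print(f"annotated genes as background: p = {p_annotated:.4g}")
```

With these made-up numbers the same 15 hits look more significant against the all-genes background, because unannotated genes can never carry the term and so lower the expected count. Restricting both the study set and the background to annotated genes avoids that particular artifact, but it does not address which genes got annotated in the first place.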
