Hello, I have gene sets from RNA-Seq for my non-model organism. I used BLAST2GO to BLAST all the sequences and map GO terms to them. Then I performed an enrichment analysis for these comparisons, and I'm trying to understand how the algorithm corrects for cases where the gene ontology is more detailed for certain genes than for others. Obviously, if gene x has a more detailed ontology than gene y, that would create a bias towards terms related to gene x. Any ideas? BTW, I have a test tomorrow about this, so quick answers will be very much appreciated :-) Thank you! N.
If I understand correctly, there's a recent paper about this:
Impact of knowledge accumulation on pathway enrichment analysis by Wadi et al.
We analyzed the evolution of gene annotations over the past seven years and found that the vocabulary of pathways and processes has doubled. This strongly impacts practical analysis of genes: 80% of publications we surveyed in 2015 used outdated software that only captured 20% of pathway enrichments apparent in current annotations.
So yeah, if you use outdated GO annotations, or most of your genes don't map to any GO term, that will have an impact on your results.
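To make the second point concrete, here is a minimal sketch of the one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) that over-representation tools commonly use. I'm not claiming this is exactly what BLAST2GO does internally. The function name and all the gene counts are made up for illustration. It compares the p-value for the same term when the background is all genes versus only the genes that have at least one GO annotation.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the study set
    k: study-set genes annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail: probability of seeing k or more annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers: 10000 genes, only 4000 with any GO annotation.
# The study set has 200 genes, 150 of which are annotated.
# The term of interest covers 100 background genes and 8 study genes.
p_all = hypergeom_enrichment_p(k=8, n=200, K=100, N=10000)
p_annotated = hypergeom_enrichment_p(k=8, n=150, K=100, N=4000)

print(f"background = all genes:       p = {p_all:.3g}")
print(f"background = annotated genes: p = {p_annotated:.3g}")
```

With these made-up counts the term looks more significant against the all-genes background than against the annotated-only background, simply because the study set is better annotated than the genome as a whole. So the choice of background is one place where uneven annotation leaks into the result.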