Question: Obtaining "Gene-Association" File For GO Term Analysis
8.6 years ago by
Arun2.3k wrote:

Hi, I am trying to perform GO term enrichment analysis using the GO::TermFinder Perl module. I have already gone through most of the relevant posts here on Biostar, especially this one. Among the available software, I'd like to use GO::TermFinder, as I find it a more powerful tool and I am quite comfortable with Perl.

Now, I require two gene files ((1) candidate genes and (2) background genes), the gene ontology term definitions, and finally a gene-association file: here's an example. I am working on tomato (Solanum lycopersicum). The genome is about to be published and, as far as I have searched, no gene-association file exists for it (one does exist for Arabidopsis thaliana, of course). It seems to be basically a file with gene IDs and their corresponding GO terms (plus some more information). I have GO terms for 22000 (out of 37000) genes from the current gene model annotation. My question: is there any software, or an easier alternative approach (I don't mind coding), to gather the information required to construct this file? I just have the gene IDs, GO terms and functional annotations.

This seems to be the only missing link for me to use GO::TermFinder (and, I suppose, any other software).

modified 6.9 years ago by Biostar ♦♦ 20 • written 8.6 years ago by Arun2.3k

Hi Arun, I have a problem similar to yours (though I hope you have solved it by now). I'm working on tomato genes and I need a gene association file to use with Ontologizer. I downloaded the same file you used to obtain GO terms (ITAG2.3genemodels), but I was able to extract GO terms for only about 19000 genes, so my first question is which method you used to extract the associations between GO terms and gene names. It would be very useful for me to retrieve about 3000 more genes! My second question is whether you have found a way to obtain a gene association file.

Thank you in advance for your help, Raffaella

written 7.8 years ago by raffaelladalessandro0

Hi Raffaella, I have the same questions you asked here, and apparently they are old questions. I tried the gene association file provided by the Solanaceae genome group, but it still contains GO terms for only about 500 genes. I am wondering whether you succeeded in generating a gene association file for tomato, and I would also like to know the method you used to obtain GO terms.


written 7.0 years ago by cau.linxin0

I switched to the MapMan software and its annotations.

written 7.0 years ago by Arun2.3k
8.6 years ago by
Damian Kao15k wrote:

I used Ontologizer for a lot of my enrichments.

I recently wrote a blog post about using Ontologizer and creating a fake .gaf file to use with it:
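
A placeholder GAF file of this kind could be generated roughly as follows. This is a minimal sketch and not the code from the blog post: it assumes the GAF 2.0 layout (17 tab-separated columns), and the function names, the "ITAG" database label, the evidence code, the reference and the default date are illustrative placeholders to adjust. Taxon 4081 is the NCBI ID for Solanum lycopersicum.

```python
import datetime

VALID_ASPECTS = {"P", "F", "C"}  # biological Process, molecular Function, cellular Component


def make_gaf_rows(annotations, db="ITAG", taxon_id=4081, evidence="IEA",
                  reference="GO_REF:0000002", assigned_by="ITAG", date=None):
    """Turn (gene_id, go_id, aspect) tuples into GAF 2.0 rows.

    All keyword defaults are placeholders (assumptions), not official values
    for any particular annotation release.
    """
    if date is None:
        date = datetime.date.today().strftime("%Y%m%d")
    rows = []
    for gene_id, go_id, aspect in annotations:
        if aspect not in VALID_ASPECTS:
            raise ValueError(f"aspect must be P, F or C, got {aspect!r}")
        columns = [
            db,                   # 1  DB
            gene_id,              # 2  DB Object ID
            gene_id,              # 3  DB Object Symbol (reusing the ID)
            "",                   # 4  Qualifier
            go_id,                # 5  GO ID
            reference,            # 6  DB:Reference
            evidence,             # 7  Evidence Code
            "",                   # 8  With/From
            aspect,               # 9  Aspect
            "",                   # 10 DB Object Name
            "",                   # 11 DB Object Synonym
            "gene",               # 12 DB Object Type
            f"taxon:{taxon_id}",  # 13 Taxon
            date,                 # 14 Date (YYYYMMDD)
            assigned_by,          # 15 Assigned By
            "",                   # 16 Annotation Extension
            "",                   # 17 Gene Product Form ID
        ]
        rows.append("\t".join(columns))
    return rows


def write_gaf(path, annotations, **kwargs):
    """Write a GAF 2.0 file with the version header and one row per annotation."""
    with open(path, "w") as handle:
        handle.write("!gaf-version: 2.0\n")
        for row in make_gaf_rows(annotations, **kwargs):
            handle.write(row + "\n")
```

For example, `write_gaf("tomato.gaf", [("geneA", "GO:0005524", "F")])` would write a single annotation for a made-up gene ID. The aspect of each GO term has to be looked up from the ontology file, since a plain list of gene IDs and GO IDs does not carry it.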

written 8.6 years ago by Damian Kao15k

Thank you DK, I'll have a look at it and come back with questions (or accept it as answer).

written 8.6 years ago by Arun2.3k
8.6 years ago by
Rama10 wrote:


The Solanaceae genome group provides a gene association file called gene_association.sgn.gz. It is available from the Gene Ontology Consortium's downloads site.

Gene Ontology terms and definitions can also be downloaded from the GOC's downloads site; download the 'gene_ontology_edit.obo' file.

BTW, if you would like to submit the GO terms for the 37000 genes, please contact the Gene Ontology Consortium. We will be happy to help you.

Please also visit the Solanaceae genome website for more information.



modified 13 months ago by RamRS30k • written 8.6 years ago by Rama10

Hi Rama, I have seen this file. Unfortunately, it has annotations for only around 500 genes. The ITAG2.3_gene_models file has GO terms for 22K genes... Why the difference?

written 8.6 years ago by Arun2.3k

The GO annotations SGN submits to GOC currently cover only manually curated functional genes (of any species we host). ITAG2.3 is the tomato genome annotation set of predicted gene models. Most of its GO annotation comes from InterPro and has not yet been submitted to GOC.
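
For anyone extracting the predicted annotations themselves, the gene-to-GO mapping could be pulled from a GFF3 annotation file along these lines. This is a sketch under assumptions: it presumes the GO IDs are stored in the GFF3 `Ontology_term` attribute of mRNA or gene features, which should be checked against the actual ITAG file, and the function name is made up for illustration.

```python
from collections import defaultdict
from urllib.parse import unquote


def go_terms_from_gff3(lines, feature_types=("mRNA", "gene")):
    """Collect GO IDs per feature ID from GFF3 lines.

    Assumes GO terms sit in the 'Ontology_term' attribute (column 9);
    features of other types and lines without GO terms are skipped.
    """
    feature_to_go = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] not in feature_types:
            continue
        # Column 9 holds semicolon-separated key=value pairs.
        attributes = {}
        for pair in fields[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attributes[key.strip()] = value
        feature_id = attributes.get("ID")
        terms = attributes.get("Ontology_term")
        if not feature_id or not terms:
            continue
        for term in terms.split(","):
            term = unquote(term).strip()  # GFF3 values may be URL-escaped
            if term.startswith("GO:"):
                feature_to_go[feature_id].add(term)
    return dict(feature_to_go)
```

Called as `go_terms_from_gff3(open("annotation.gff3"))` (file name hypothetical), it returns a dictionary mapping each feature ID to a set of GO IDs. Counting the keys gives the number of annotated genes, which makes it easy to see how the chosen feature types affect the total.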


written 8.5 years ago by naama.menda0

Hi Rama,

I work on two genomes that are widely used in research these days: horse and sheep. Any idea when gene association files will be ready for these two genomes? A lot of people would be very happy to use them ASAP.

Thanks! M.

modified 13 months ago by RamRS30k • written 6.1 years ago by madkitty620