Where to find *full* GO gene annotation file?
6.8 years ago
iamjli ▴ 10

Hi,

I've been struggling to find a full, unfiltered GO annotation file (GAF). A number of different GO enrichment sites point here to download the GAF file: http://www.geneontology.org/page/download-annotations

But this is apparently not complete. For instance, when I use GOrilla, GO:0010604 appears as an enriched pathway. However, GO:0010604 does not show up in the file from the gene ontology website.

What am I missing here? Where can I find the full file?

Thanks!

gene ontology GAF GO enrichment • 4.9k views

I do find GO:0010604 (positive regulation of macromolecule metabolic process) in AmiGO and the Ontology Lookup Service, so I wonder which Gene Ontology website you are referring to.


Yes, I'm wondering why AmiGO has it, but not the GAF file. I am referring to this file in particular: http://geneontology.org/gene-associations/goa_human.gaf.gz

6.8 years ago

Make sure that you distinguish between an annotation and a definition. GO:0010604 is a term, and its definition is present in the GO definition (ontology) file.

wget http://purl.obolibrary.org/obo/go.obo

Check the term of interest (the `^` anchor avoids also matching `alt_id:` lines):

grep -A 3 "^id: GO:0010604" go.obo

produces:

id: GO:0010604
name: positive regulation of macromolecule metabolic process
namespace: biological_process
def: "Any process that increases the frequency, rate or extent of the chemical reactions and pathways involving macromolecules, any molecule of high relative molecular mass, the structure of which essentially comprises the multiple repetition of units derived, actually or conceptually, from molecules of low relative molecular mass." [GOC:dph, GOC:tb]

Now this term may not be present in an annotation file if none of the gene products have been annotated with this term.
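Whether a term is named directly in an annotation file can be checked by counting its occurrences in the GO ID column. Here is a minimal Python sketch, assuming the tab-separated GAF layout in which comment lines start with `!` and the GO ID is the fifth column. The function names and the sample rows (including the `ExampleDB` identifiers) are made up for illustration and are not taken from any real file.

```python
import gzip
from collections import Counter


def count_direct_annotations(lines):
    """Count how many annotation rows name each GO ID directly.

    `lines` is any iterable of GAF text lines. Comment/header lines
    start with '!' and are skipped; the GO ID is column 5 (index 4).
    """
    counts = Counter()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4:
            counts[fields[4]] += 1
    return counts


def count_in_gzipped_gaf(path):
    """Same count, reading a gzip-compressed GAF file from disk."""
    with gzip.open(path, "rt") as handle:
        return count_direct_annotations(handle)


# Made-up rows in GAF column order (only the first few columns are shown).
sample = [
    "!gaf-version: 2.2\n",
    "ExampleDB\tID1\tGENE1\t\tGO:0045893\tREF:1\tIDA\n",
    "ExampleDB\tID2\tGENE2\t\tGO:0045893\tREF:2\tIMP\n",
    "ExampleDB\tID3\tGENE3\t\tGO:0006412\tREF:3\tIEA\n",
]
counts = count_direct_annotations(sample)
print(counts["GO:0045893"], counts["GO:0010604"])
```

For a downloaded file, something like `count_in_gzipped_gaf("goa_human.gaf.gz")["GO:0010604"]` would give the number of rows that name the term directly; a count of zero says nothing about annotations to its descendants.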


Yes, I understand the difference between the two files. Why does GO:0010604 appear in AmiGO and GOrilla, but not in the .gaf file? (This is the one I'm referring to: http://geneontology.org/gene-associations/goa_human.gaf.gz)


Ok, now I understand what you mean.

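The GAF file is complete and non-redundant: it contains the minimal number of annotations necessary to annotate the data, with each gene product listed only under the most specific terms that apply. So, for example, GO:0045893 is a more specific term that, through the is_a hierarchy, is in turn a GO:0010604.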

When the annotation file contains GO:0045893, the annotated genes are implicitly annotated with all the ancestors of this term as well, but those ancestor entries are not written to the file. Tools like AmiGO search not just the leaf nodes but also the intermediate ones, so a term is reported whenever any of its descendants has supporting annotations.
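The ancestor propagation described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how AmiGO or GOrilla is implemented: it follows only the `is_a` lines of `[Term]` stanzas in an OBO file, and whether to also follow other relations (such as `part_of`) is a choice each tool makes. The `EX:` term IDs, the gene name, and all function names are hypothetical.

```python
from collections import defaultdict


def parse_is_a(obo_lines):
    """Return {term_id: set(parent_ids)} from is_a lines in [Term] stanzas."""
    parents = defaultdict(set)
    current = None
    in_term = False
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:]
        elif in_term and current and line.startswith("is_a: "):
            # "is_a: EX:0000001 ! name" -> keep only the ID
            parents[current].add(line[6:].split()[0])
    return parents


def ancestors(term, parents):
    """All terms reachable from `term` by following is_a links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(gene_to_terms, parents):
    """Add every ancestor of each directly annotated term to each gene."""
    expanded = {}
    for gene, terms in gene_to_terms.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        expanded[gene] = full
    return expanded


# Toy ontology with made-up IDs: leaf is_a middle is_a top.
toy_obo = """\
[Term]
id: EX:0000001
name: top

[Term]
id: EX:0000002
name: middle
is_a: EX:0000001 ! top

[Term]
id: EX:0000003
name: leaf
is_a: EX:0000002 ! middle

[Typedef]
id: part_of
""".splitlines()

parents = parse_is_a(toy_obo)
direct = {"geneA": {"EX:0000003"}}  # as it would appear in the GAF
expanded = propagate(direct, parents)
print(sorted(expanded["geneA"]))
```

In the toy data, `geneA` is annotated only to the leaf term, yet after propagation it also carries the middle and top terms. Applying the same idea to `go.obo` and the gene-to-term pairs read from a GAF file would yield the expanded annotation set that enrichment tools work with.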


Gotcha, thanks for the info, Istvan. I guess my question is then where can I download the annotations that AmiGO uses? I see they have an option to download filtered searches, but I'm looking for the entire GAF file.


AmiGO uses the same annotations. When it parses the file, it builds a data model inside the program itself, and querying that model is the service it offers.
