Question

How to create your own association file for gene ontology enrichment analysis?

10.4 years ago

jack ▴ 990

Hi,

I want to run a GO enrichment analysis with Ontologizer, but there is no predefined association file because my organism is not a model organism.

I have parsed the annotations for that organism into a two-column file like this:

geneA  GO:0006950,GO:0005737
geneB  GO:0016020,GO:0005524,GO:0006468,GO:0005737,GO:0004674,GO:0006914,GO:0016021,GO:0015031
geneC  GO:0003779,GO:0006941,GO:0005524,GO:0003774,GO:0005516,GO:0005737,GO:0005863
geneD  GO:0005634,GO:0003677,GO:0030154,GO:0006350,GO:0006355,GO:0007275,GO:0030528

I have also downloaded the GO terms file from the Gene Ontology site (http://www.geneontology.org/doc/GO.terms_and_ids), which contains this information:

!
! GO IDs (primary only) and name text strings
! GO:0000000 [tab] text string [tab] F|P|C
! where F = molecular function, P = biological process, C = cellular component
!
GO:0000001  mitochondrion inheritance   P
GO:0000002  mitochondrial genome maintenance    P
GO:0000003  reproduction    P
GO:0000005  ribosomal chaperone activity    F
GO:0000006  high affinity zinc uptake transmembrane transporter activity    F
GO:0000007  low-affinity zinc ion transmembrane transporter activity    F
GO:0000008  thioredoxin F
GO:0000009  alpha-1,6-mannosyltransferase activity  F
GO:0000010  trans-hexaprenyltranstransferase activity   F
GO:0000011  vacuole inheritance P

What I need as output is a .gaf file in the following format (the same format as the files available at http://geneontology.org/page/download-annotations):

!gaf-version: 2.0
!Project_name: Leishmania major GeneDB
!URL: http://www.genedb.org/leish
!Contact Email: mb4@sanger.ac.uk
GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0003723    PMID:22396527    ISO    GeneDB:Tb927.10.10130    F    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor       
GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0044429    PMID:20660476    ISS        C    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20100803    GeneDB_Lmajor       
GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0016554    PMID:22396527    ISO    GeneDB:Tb927.10.10130    P    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor       
GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0048255    PMID:22396527    ISO    GeneDB:Tb927.10.10130    P    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor

Can someone help me with a bash script to do this?

enrichment ontology R software-error RNA-Seq • 6.2k views

updated 3.3 years ago by Ram 45k • written 10.4 years ago by jack ▴ 990

Which organism are you working with?

10.4 years ago by EagleEye 7.6k

Hello, I want to do the same thing. How did you resolve this problem?

8.2 years ago by xzpgocxx ▴ 20

If your genes are already annotated and all you need is to put these annotations into the GO annotation file (GAF) format, the format is described here. Use your favorite scripting language to convert what you have into this format. You probably only need the first 9 columns.
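
As a rough illustration of that conversion, here is a minimal Python sketch (not a bash script, and not tested against Ontologizer itself). It assumes the gene file is whitespace-separated with comma-separated GO IDs in the second column, and that the GO terms file is tab-separated as its header states. The function names `load_aspects` and `gene2go_to_gaf` are made up for this example, and the values for the database name, reference, evidence code and taxon are placeholders that you would need to replace with something appropriate for your organism. It writes all 17 GAF 2.0 columns, leaving the optional ones empty.

```python
from datetime import date


def load_aspects(lines):
    """Map GO ID -> aspect letter (F, P or C) from GO.terms_and_ids-style lines."""
    aspects = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue  # skip blank lines and "!" comment lines
        fields = line.split("\t")
        if len(fields) >= 3:
            aspects[fields[0].strip()] = fields[2].strip()
    return aspects


def gene2go_to_gaf(lines, aspects, db="MyDB", reference="MyDB:placeholder_ref",
                   evidence="IEA", taxon="taxon:0", annotation_date=None):
    """Turn 'gene<whitespace>GO:1,GO:2' lines into GAF 2.0 rows.

    db, reference, evidence and taxon are PLACEHOLDER assumptions, not real values.
    Returns (rows, skipped), where skipped lists (gene, go_id) pairs whose
    aspect could not be found in the terms file.
    """
    stamp = annotation_date or date.today().strftime("%Y%m%d")
    rows = ["!gaf-version: 2.0"]
    skipped = []
    for line in lines:
        parts = line.split(None, 1)  # gene ID, then the comma-separated GO list
        if len(parts) != 2:
            continue
        gene, terms = parts
        for go_id in terms.split(","):
            go_id = go_id.strip()
            if not go_id:
                continue
            aspect = aspects.get(go_id)
            if aspect is None:
                skipped.append((gene, go_id))
                continue
            # GAF 2.0 columns: DB, object ID, symbol, qualifier, GO ID,
            # reference, evidence, with/from, aspect, name, synonym,
            # object type, taxon, date, assigned by, extension, product form
            cols = [db, gene, gene, "", go_id, reference, evidence, "",
                    aspect, "", "", "gene", taxon, stamp, db, "", ""]
            rows.append("\t".join(cols))
    return rows, skipped


if __name__ == "__main__":
    # Tiny inline demo; geneX and GO:9999999 are invented for illustration.
    terms = ["! GO IDs (primary only) and name text strings",
             "GO:0000001\tmitochondrion inheritance\tP",
             "GO:0000009\talpha-1,6-mannosyltransferase activity\tF"]
    genes = ["geneX  GO:0000001,GO:0000009,GO:9999999"]
    rows, skipped = gene2go_to_gaf(genes, load_aspects(terms))
    print("\n".join(rows))
    print("skipped:", skipped)
```

To use it on real data, you would pass open file handles instead of the inline lists and write the rows to a .gaf file. GO IDs that are missing from the terms file (for example obsolete or secondary IDs, since that file lists primary IDs only) end up in `skipped` rather than in the output, so it is worth checking that list.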

8.2 years ago by Jean-Karim Heriche 27k

Also have a look at GeneSCF to see whether it works for you.

Note: Check this question regarding non-model organisms and GeneSCF usage.

8.2 years ago by EagleEye 7.6k