Question: How to create your own association file for gene ontology enrichment analysis?
3.5 years ago by
jack710
Germany
jack710 wrote:

Hi,

 

I want to do a gene enrichment analysis using Ontologizer, but there is no predefined association file for my organism (it's not a model organism).

I have parsed a two-column file for that organism (gene ID, then a comma-separated list of GO IDs), like this:

geneA  GO:0006950,GO:0005737
geneB  GO:0016020,GO:0005524,GO:0006468,GO:0005737,GO:0004674,GO:0006914,GO:0016021,GO:0015031
geneC  GO:0003779,GO:0006941,GO:0005524,GO:0003774,GO:0005516,GO:0005737,GO:0005863
geneD  GO:0005634,GO:0003677,GO:0030154,GO:0006350,GO:0006355,GO:0007275,GO:0030528

 

I have also downloaded the GO terms file from the Gene Ontology website (http://www.geneontology.org/doc/GO.terms_and_ids), which contains this information:



!
! GO IDs (primary only) and name text strings
! GO:0000000 [tab] text string [tab] F|P|C
! where F = molecular function, P = biological process, C = cellular component
!
GO:0000001  mitochondrion inheritance   P
GO:0000002  mitochondrial genome maintenance    P
GO:0000003  reproduction    P
GO:0000005  ribosomal chaperone activity    F
GO:0000006  high affinity zinc uptake transmembrane transporter activity    F
GO:0000007  low-affinity zinc ion transmembrane transporter activity    F
GO:0000008  thioredoxin F
GO:0000009  alpha-1,6-mannosyltransferase activity  F
GO:0000010  trans-hexaprenyltranstransferase activity   F
GO:0000011  vacuole inheritance P

 

What I need as output is a .gaf file in the following format (the same format as the files here: http://geneontology.org/page/download-annotations):

!gaf-version: 2.0

!Project_name: Leishmania major GeneDB

!URL: http://www.genedb.org/leish

!Contact Email: mb4@sanger.ac.uk

GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0003723    PMID:22396527    ISO    GeneDB:Tb927.10.10130    F    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor       

GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0044429    PMID:20660476    ISS        C    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20100803    GeneDB_Lmajor       

GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0016554    PMID:22396527    ISO    GeneDB:Tb927.10.10130    P    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor       

GeneDB_Lmajor    LmjF.36.4770    LmjF.36.4770        GO:0048255    PMID:22396527    ISO    GeneDB:Tb927.10.10130    P    mitochondrial RNA binding complex 1 subunit, putative    LmjF36.4770    gene    taxon:347515    20120910    GeneDB_Lmajor 

 

Can someone help me with a bash script to do this?

modified 15 months ago by xzpgocxx0 • written 3.5 years ago by jack710

which organism are you working with?

written 3.5 years ago by EagleEye5.5k

Hello, I also want to do the same thing. How did you resolve this problem?

written 15 months ago by xzpgocxx0

If your genes are already annotated and all you need is to put these annotations into the GO annotation file (GAF) format, the format is described here. Use your favorite scripting language to convert what you have into this format. You probably only need the first 9 columns.
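
A minimal sketch of such a conversion in Python, assuming the two-column gene/GO file and the tab-separated GO.terms_and_ids file shown in the question. The aspect (F, P or C) for each GO ID is looked up in the terms file and one GAF 2.0 row with 17 tab-separated columns is written per gene/term pair. The function names, the database name "MyDB", the reference "GO_REF:0000000" and the taxon "taxon:0" are placeholders made up for illustration, and the evidence code IEA (inferred from electronic annotation) is only a guess at what fits computationally derived annotations. The demo data at the bottom is invented too. Whether Ontologizer accepts the result as-is has not been checked.

```python
import datetime


def load_aspects(terms_lines):
    """Map GO ID -> aspect letter (F, P or C) from GO.terms_and_ids lines."""
    aspects = {}
    for line in terms_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):  # skip blanks and header comments
            continue
        fields = line.split("\t")  # GO ID, term name, aspect
        if len(fields) >= 3:
            aspects[fields[0].strip()] = fields[-1].strip()
    return aspects


def to_gaf(assoc_lines, aspects, db="MyDB", evidence="IEA",
           reference="GO_REF:0000000", taxon="taxon:0", date=None):
    """Return (GAF lines, skipped pairs). All keyword defaults are placeholders."""
    date = date or datetime.date.today().strftime("%Y%m%d")
    rows, skipped = ["!gaf-version: 2.0"], []
    for line in assoc_lines:
        parts = line.split()  # gene ID, then comma-separated GO IDs
        if len(parts) < 2:
            continue
        gene = parts[0]
        for go_id in parts[1].split(","):
            if not go_id:
                continue
            aspect = aspects.get(go_id)
            if aspect is None:  # ID not in the terms file, so aspect is unknown
                skipped.append((gene, go_id))
                continue
            # GAF 2.0 columns: DB, object ID, symbol, qualifier, GO ID,
            # reference, evidence, with/from, aspect, name, synonym,
            # object type, taxon, date, assigned by, extension, product form
            cols = [db, gene, gene, "", go_id, reference, evidence, "",
                    aspect, "", "", "gene", taxon, date, db, "", ""]
            rows.append("\t".join(cols))
    return rows, skipped


# Invented demo data, only to show the call pattern.
demo_terms = [
    "! GO IDs (primary only) and name text strings",
    "GO:0000001\tmitochondrion inheritance\tP",
    "GO:0000005\tribosomal chaperone activity\tF",
]
demo_assoc = ["geneA  GO:0000001,GO:0000005", "geneB  GO:9999999"]

gaf_rows, not_found = to_gaf(demo_assoc, load_aspects(demo_terms))
print("\n".join(gaf_rows))
print("skipped:", not_found)
```

For real data, pass open file handles instead of the demo lists, for example `load_aspects(open("GO.terms_and_ids"))`. GO IDs that are missing from the terms file (it lists primary IDs only) are collected in the second return value rather than written with an unknown aspect, so they can be reviewed by hand.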

written 15 months ago by Jean-Karim Heriche16k

Also have a look at GeneSCF, if it works for you.

Note: Check this question regarding non-model organisms and GeneSCF usage.

modified 15 months ago • written 15 months ago by EagleEye5.5k