Question

GO in Non-Model Organisms

1

10.7 years ago

nanana ▴ 120

I have a list of differentially expressed genes in Xenopus laevis that I'd like to functionally annotate with GO terms. As X. laevis is not listed in DAVID, it seems I have two options:

BLAST my DE genes against a database with GO terms (human/mouse) to look for orthologs
Use a program that assigns GO from BLAST results, such as Blast2GO

I've been trying Blast2GO recently, but find it very slow and generally lacking. Does anyone have experience doing GO annotation in Xenopus or other non-model organisms?

rna-seq gene-ontology • 7.1k views

ADD COMMENT • link updated 10.6 years ago by Vladimir Chupakhin ▴ 520 • written 10.7 years ago by nanana ▴ 120

score 5 · Answer 1 · 2014-02-12

5

10.7 years ago

pld 5.1k

If you have local resources (a cluster) and some scripting capability, it might be faster to generate lists of orthologs between your study species and a species with GO annotation. From there, with some more scripting, you can map that species' GO annotation set onto your species through these orthologs. I usually end up putting all of this into a SQL database to make future analysis easier. This is the approach I've used for bats and primates: Python and MySQL with some bits of Bash. I believe there are modules for doing this in R/Bioconductor.

If you don't have access to this type of thing, your best bet is Blast2GO. One advantage of the ortholog method is that, with the lists of orthologs, you can still use a tool like DAVID.
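As a rough illustration of the database idea, here is a minimal sketch that stores the ortholog pairs and the GO annotations in two tables and joins them. It uses Python's built-in sqlite3 rather than MySQL so that it runs without a server; the table names, column names, and IDs are all placeholders made up for the example, not anything from a real pipeline.

```python
import sqlite3


def build_db(orthologs, annotations, path=":memory:"):
    """Load ortholog pairs and GO annotations into a small SQLite database.

    orthologs:   iterable of (study_id, ref_id) pairs
    annotations: iterable of (ref_id, go_id) pairs
    path:        ":memory:" for a throwaway database, or a file path to keep it
    """
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE ortholog (study_id TEXT, ref_id TEXT);
        CREATE TABLE annotation (ref_id TEXT, go_id TEXT);
        CREATE INDEX idx_annotation_ref ON annotation (ref_id);
        """
    )
    conn.executemany("INSERT INTO ortholog VALUES (?, ?)", orthologs)
    conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotations)
    conn.commit()
    return conn


def go_terms_for(conn, study_id):
    """Return the sorted, de-duplicated GO IDs inherited through orthologs."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.go_id
        FROM ortholog AS o
        JOIN annotation AS a ON a.ref_id = o.ref_id
        WHERE o.study_id = ?
        ORDER BY a.go_id
        """,
        (study_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    conn = build_db(
        orthologs=[("xl_gene_1", "REF_A"), ("xl_gene_2", "REF_B")],
        annotations=[("REF_A", "GO:0000002"), ("REF_A", "GO:0000001")],
    )
    print(go_terms_for(conn, "xl_gene_1"))
```

Keeping the two tables separate means the ortholog list can be rebuilt (for example with different BLAST cutoffs) without reloading the annotations.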

1

That's roughly what I was thinking - do you know the best GO annotated databases to blast against?

ADD REPLY • link 10.6 years ago by nanana ▴ 120

1

I went straight to the source: http://www.geneontology.org/. I downloaded the annotations for the species I was interested in. Those files contain UniProt IDs and GO annotations. Each line represents a single annotation for a single protein, so a protein with multiple annotations has multiple rows.
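For anyone who wants to script this, here is a minimal sketch of reading such a file into a protein-to-GO-terms lookup. It assumes the standard tab-separated GAF layout (header lines start with `!`, column 2 is the protein ID, column 4 the qualifier, column 5 the GO ID). Skipping `NOT` annotations is my own choice for the example, and the file name in the usage part is hypothetical.

```python
from collections import defaultdict


def read_gaf(lines):
    """Collect GO IDs per protein ID from the lines of a GAF annotation file."""
    annotations = defaultdict(set)
    for line in lines:
        # Header/comment lines in GAF start with "!".
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed row, skip it
        protein_id, qualifier, go_id = fields[1], fields[3], fields[4]
        # A "NOT" qualifier means the protein is annotated as NOT having
        # this function, so it should not be transferred as a positive term.
        if "NOT" in qualifier.split("|"):
            continue
        annotations[protein_id].add(go_id)
    return dict(annotations)


if __name__ == "__main__":
    # Hypothetical path; point this at the file you downloaded.
    with open("annotations.gaf") as handle:
        go_by_protein = read_gaf(handle)
    print(len(go_by_protein), "annotated proteins")
```

Because each row is one annotation, collecting into a set per protein also removes duplicate GO IDs that differ only in evidence code or reference.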

From there I just built my list of orthologs using a local BLAST setup. NCBI and Ensembl have readily downloadable sets of sequences for a given species; I used these to build the BLAST databases. Once you have your ortholog list you can just look up annotations in the GO annotation files.

So: study organism protein <-{reciprocal BLAST}-> GO organism protein -> GO annotation set for that protein

In the end this gives you: study organism protein -> GO annotation set
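The chain above could be sketched roughly like this. It assumes both BLAST runs were written in the default 12-column tabular format (`-outfmt 6`, with query ID, subject ID, and bitscore in columns 1, 2, and 12) and treats "reciprocal" as reciprocal best hit by bitscore. That is one common definition, not necessarily the exact criteria I used, and the function names are made up for the example.

```python
def best_hits(blast_lines):
    """Map each query to its highest-bitscore subject in BLAST tabular output."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep only the strongest hit seen so far for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_lines, reverse_lines):
    """Keep pairs where each protein is the other's best hit.

    forward_lines: your proteins searched against the GO-annotated species
    reverse_lines: the GO-annotated species searched against your proteins
    """
    forward = best_hits(forward_lines)
    reverse = best_hits(reverse_lines)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}


def transfer_go(orthologs, go_by_protein):
    """Give each protein the GO terms of its ortholog, if it has any."""
    return {
        query: sorted(go_by_protein[subject])
        for query, subject in orthologs.items()
        if subject in go_by_protein
    }
```

In practice you would probably also want e-value or coverage cutoffs before calling something an ortholog; those are left out here to keep the sketch short.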

ADD REPLY • link 10.6 years ago by pld 5.1k

score 0 · Answer 2 · 2014-02-13

0

10.6 years ago

Vladimir Chupakhin ▴ 520

Homologue/orthologue databases may be helpful here. The UniProt ID mapping service can help with that.
