How To Find Go Numbers Given A Protein Coding Sequence?
10.7 years ago

I have a large collection of protein coding sequences and I'd like to find GO numbers for them in order to categorize them by function. I can automate the NCBI BLAST process, yielding accession numbers (like XM_329174.1, NM_234985.3, etc.), but don't know how to convert those to GO numbers. Is there a database that maps NCBI accession numbers to GO numbers? Also, is there a source that describes the GO hierarchy in a machine-readable form?

go gene ontology protein • 4.9k views

Please search this site for "biomart".

10.7 years ago
Björn ▴ 670

Hi Joshua,

if you start with a sequence, you can also try Blast2GO. It is also available as a standalone version and has a Galaxy integration.

Cheers,

Bjoern

10.7 years ago

Another alternative may be to use InterPro. InterPro has started an effort to annotate all HMMs with GO terms, so this might be a valuable addition to your data.

You can use the InterProScan executable for automated command-line usage. Runtime is only slightly above that of BLAST.
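
To pull the GO terms out of such a run automatically, something like the following Python sketch might help. It assumes the tab-separated output format and that InterProScan was run with GO lookup enabled, in which case the GO IDs should appear in the 14th column; please check this against the output of your own version. The function name and the sample rows (including the accessions in them) are made up for illustration only.

```python
import csv
import io
import re
from collections import defaultdict

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_by_protein(tsv_handle):
    """Map each sequence ID (column 1) to the set of GO IDs found in column 14.

    Assumption: the input is InterProScan-style TSV with GO terms in the
    14th column. Rows without that column contribute an empty set.
    """
    terms = defaultdict(set)
    for row in csv.reader(tsv_handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        go_field = row[13] if len(row) > 13 else ""
        # A regex search copes with separators or suffixes around the IDs.
        terms[row[0]].update(GO_ID.findall(go_field))
    return dict(terms)


# Made-up sample rows, not real InterProScan output.
sample_rows = [
    ["seq1", "md5", "100", "Pfam", "PF00001", "desc", "1", "50", "1e-5",
     "T", "01-01-2020", "IPR000001", "ipr desc", "GO:0005515|GO:0006810"],
    ["seq2", "md5", "80", "Pfam", "PF00002", "desc", "5", "60", "1e-3",
     "T", "01-01-2020"],
]
sample = "\n".join("\t".join(r) for r in sample_rows) + "\n"

result = go_terms_by_protein(io.StringIO(sample))
for protein, gos in sorted(result.items()):
    print(protein, ",".join(sorted(gos)) or "-")
```

For a real run you would pass an open file handle for your result file instead of the `io.StringIO` sample.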
