Question

Python Function That Searches For GO Terms Of Proteins

11.3 years ago

nadia.sl89 • 0

Hi,

How can I make a Python function that, for each protein in a FASTA file, searches for the GO terms in UniProt?

What is the script that I need to use?

Thank you!!!

python go • 4.1k views

updated 11.3 years ago by Xingyu Yang ▴ 280 • written 11.3 years ago by nadia.sl89 • 0

score 1 · Answer 1 · 2014-03-31

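You need Biopython. First, BLAST your sequences against UniProt using the qblast() function in Bio.Blast.NCBIWWW. Since qblast() submits the search to NCBI's servers, choose a UniProt-derived database such as "swissprot". See the instructions here: http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec95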
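A minimal sketch of this first step, assuming Biopython is installed and that the NCBI "swissprot" database is an acceptable stand-in for UniProt. The function name blast_fasta_against_swissprot and the max_hits parameter are made up for illustration. Each call to qblast() goes over the network and can take a while, so this is only practical for a modest number of sequences.

```python
def blast_fasta_against_swissprot(fasta_path, max_hits=5):
    """Yield (query_id, blast_record) for every protein in a FASTA file.

    Hypothetical helper: the name and the max_hits default are illustrative.
    """
    # Biopython is imported lazily so this module still loads without it.
    from Bio import SeqIO
    from Bio.Blast import NCBIWWW, NCBIXML

    for seq_record in SeqIO.parse(fasta_path, "fasta"):
        # Submit one protein at a time to NCBI's online BLAST service.
        handle = NCBIWWW.qblast(
            "blastp",                    # protein query vs. protein database
            "swissprot",                 # assumption: Swiss-Prot as hosted by NCBI
            seq_record.format("fasta"),
            hitlist_size=max_hits,
        )
        try:
            blast_record = NCBIXML.read(handle)  # one query gives one record
        finally:
            handle.close()
        yield seq_record.id, blast_record
```

Typical use would be a loop such as "for query_id, blast_record in blast_fasta_against_swissprot('proteins.fasta'): ...", where the file name is a placeholder.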

Second, filter the BLAST results. For example, a hit can be considered a good ortholog only when the matched length is longer than 60% of the full length of the query protein.
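The 60% rule could look roughly like the sketch below. It relies only on the attributes that Biopython's parsed BLAST alignments and HSPs expose (accession, hsps, query_start, query_end, expect). The function name best_covering_hit, the choice of the lowest E-value as the tie-breaker, and the stand-in objects at the bottom (with made-up values) are assumptions for illustration.

```python
from types import SimpleNamespace


def best_covering_hit(alignments, query_length, min_coverage=0.6):
    """Return (accession, e_value, coverage) of the best hit, or None.

    A hit qualifies only if one HSP spans more than min_coverage of the
    query; among qualifying HSPs the lowest E-value wins.
    """
    best = None
    for alignment in alignments:
        for hsp in alignment.hsps:
            # Query coordinates are 1-based and inclusive.
            coverage = (hsp.query_end - hsp.query_start + 1) / query_length
            if coverage <= min_coverage:
                continue
            if best is None or hsp.expect < best[1]:
                best = (alignment.accession, hsp.expect, coverage)
    return best


# Stand-in objects mimicking Biopython's attributes (values are made up).
short_hit = SimpleNamespace(
    accession="HIT_A",
    hsps=[SimpleNamespace(query_start=1, query_end=40, expect=1e-30)],
)
long_hit = SimpleNamespace(
    accession="HIT_B",
    hsps=[SimpleNamespace(query_start=5, query_end=95, expect=1e-20)],
)
demo = best_covering_hit([short_hit, long_hit], query_length=100)
```

With a real search you would call it as best_covering_hit(blast_record.alignments, len(seq_record.seq)). Note that HIT_A is rejected despite its better E-value because it covers only 40% of the query.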

Third, retrieve the GO information for the best BLAST hit. You still need Biopython to do so: http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec152
You will get a variable named "record" as shown in the example. Its attribute "cross_references" (use dir(record) to confirm the exact name in your Biopython version) holds the database cross-references of the entry, and the GO terms are the entries whose first element is "GO".
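As a rough sketch of this last step, assuming the record comes from Bio.SwissProt and that each cross-reference is a tuple whose first element is the database name: the helper names go_terms_from_cross_references and fetch_go_terms are made up, and the example tuples at the bottom are hand-written to show the expected shape (the EMBL entry is a placeholder, not a real accession).

```python
def go_terms_from_cross_references(cross_references):
    """Return (GO id, description) pairs from Swiss-Prot style cross-references."""
    return [
        (ref[1], ref[2])
        for ref in cross_references
        if len(ref) >= 3 and ref[0] == "GO"
    ]


def fetch_go_terms(accession):
    """Download one UniProt entry and return its GO terms (needs network access)."""
    # Biopython is imported lazily so the pure helper above works without it.
    from Bio import ExPASy, SwissProt

    handle = ExPASy.get_sprot_raw(accession)
    try:
        record = SwissProt.read(handle)
    finally:
        handle.close()
    return go_terms_from_cross_references(record.cross_references)


# Hand-written cross-references in the shape Bio.SwissProt produces.
example_refs = [
    ("EMBL", "PLACEHOLDER", "-", "mRNA"),
    ("GO", "GO:0005524", "F:ATP binding", "IEA:UniProtKB-KW"),
]
example_terms = go_terms_from_cross_references(example_refs)
```

The one-letter prefix in the description marks the GO aspect: F for molecular function, P for biological process, C for cellular component.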