Python Function That Searches For GO Terms Of Proteins
10.1 years ago
nadia.sl89 • 0

Hi,

How can I make a Python function that, for each protein in a FASTA file, searches for the GO terms in UniProt?

What is the script that I need to use?

Thank you!!!

python go • 3.8k views
10.1 years ago
Xingyu Yang ▴ 280

You need Biopython. First, BLAST your sequences against UniProt using the qblast() function in Bio.Blast.NCBIWWW. See the instructions here: http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec95
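
A minimal sketch of this first step, not a definitive implementation. It assumes that NCBI's "swissprot" database is an acceptable stand-in for UniProt and that an E-value cutoff of 1e-5 is reasonable; neither choice comes from the answer above. The helper names read_fasta and blast_protein are hypothetical. blast_protein needs Biopython and network access, and newer Biopython releases may warn that Bio.Blast.NCBIWWW is deprecated.

```python
def read_fasta(path):
    """Yield (identifier, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Use the first word of the header as the identifier.
                words = line[1:].split()
                name, chunks = (words[0] if words else ""), []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def blast_protein(sequence, database="swissprot", evalue=1e-5):
    """Run a remote blastp search and return the parsed BLAST record.

    The database name and E-value cutoff are assumptions made for this sketch.
    """
    # Imported here so read_fasta() still works without Biopython installed.
    from Bio.Blast import NCBIWWW, NCBIXML

    handle = NCBIWWW.qblast("blastp", database, sequence, expect=evalue)
    try:
        return NCBIXML.read(handle)
    finally:
        handle.close()
```

Typical use would be a loop such as "for name, seq in read_fasta('proteins.fasta'): record = blast_protein(seq)", where proteins.fasta is a placeholder file name. Remote searches are slow, so expect this to take a while per sequence.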

Second, filter the BLAST results. For example, a hit can be considered a good ortholog only when the aligned length is longer than 60% of the full length of the query protein.
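
The 60% rule can be sketched as a small filter. The Hit class and the function names below are hypothetical, and picking the surviving hit with the highest bit score is an assumption of this sketch; the answer only states the coverage threshold.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str
    aligned_length: int  # number of query residues covered by the alignment
    bit_score: float


def query_coverage(hit, query_length):
    """Fraction of the query protein covered by the alignment."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return hit.aligned_length / query_length


def best_hit(hits, query_length, min_coverage=0.6):
    """Return the highest-scoring hit covering more than min_coverage of the query.

    Returns None when no hit passes the coverage filter.
    """
    good = [h for h in hits if query_coverage(h, query_length) > min_coverage]
    # Ranking by bit score is an assumption; E-value or identity would also work.
    return max(good, key=lambda h: h.bit_score, default=None)
```

The query length can be taken from the FASTA sequence itself, so the filter does not depend on any particular BLAST output format.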

Third, retrieve the GO information for the best BLAST hit. Biopython can do this as well: http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec152
You will get a variable named "record", as shown in the example. The GO terms are listed in its attribute "cross_references" (use dir(record) to check the exact name in your Biopython version).
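
A hedged sketch of pulling the GO terms out of such a record. It assumes the cross-references are stored as tuples whose first element is the database name, with GO entries shaped like ("GO", "GO:0005737", "C:cytoplasm", ...), which is how Bio.SwissProt records appear to represent them. The function name go_terms is hypothetical, and it works on plain tuples so it can be tried without Biopython.

```python
# One-letter GO aspect codes used in UniProt cross-references.
ASPECTS = {
    "C": "cellular_component",
    "F": "molecular_function",
    "P": "biological_process",
}


def go_terms(cross_references):
    """Return (go_id, aspect, name) tuples for every GO cross-reference."""
    terms = []
    for ref in cross_references:
        # Skip other databases (EMBL, PDB, ...) and malformed entries.
        if len(ref) < 2 or ref[0] != "GO":
            continue
        go_id = ref[1]
        description = ref[2] if len(ref) > 2 else ""
        # Descriptions look like "C:cytoplasm": aspect code, colon, term name.
        code, _, name = description.partition(":")
        terms.append((go_id, ASPECTS.get(code, code), name))
    return terms
```

With a parsed record in hand, the call would be go_terms(record.cross_references).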


However, if you need to process a large set of proteins, I highly recommend downloading the UniProt database and running both the BLAST search and the annotation locally.
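
One way the local route might look, as a sketch only. It assumes NCBI BLAST+ is installed, that a protein database has already been built from the downloaded UniProt FASTA (for example with makeblastdb -in uniprot_sprot.fasta -dbtype prot -out uniprot_sprot), and that results are written in tabular format 6 with its default twelve columns. The function names and the 1e-5 cutoff are hypothetical choices.

```python
def blastp_command(query_fasta, db, out_tsv, evalue=1e-5):
    """Build the argument list for a local blastp run with tabular output."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",
        "-out", out_tsv,
    ]


def best_hits_from_tabular(lines):
    """Map each query ID to its (subject ID, bit score) with the highest bit score.

    Expects default outfmt 6 columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bits = fields[0], fields[1], float(fields[11])
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return best
```

The command can be run with subprocess.run(blastp_command("proteins.fasta", "uniprot_sprot", "hits.tsv"), check=True), after which the lines of hits.tsv can be passed to best_hits_from_tabular. The file names here are placeholders.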
