Question: How To Use Python To Retrieve Results From UniProt Automatically
6
6.9 years ago by
anlin0000790
United States
anlin0000790 wrote:

I want to use a Gene Ontology term to get the related sequences from UniProt. It is simple to do manually, but I want to do it with Python. Does anybody have ideas? For example, given GO:0070337, I want to download all the search results as a FASTA file. Thanks.

Thank you, guys. In the end I used urllib with this URL: 'http://www.uniprot.org/uniprot/?query=go%3a'+GOTERM+'&force=yes&format=list'
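For anyone landing here later, the urllib approach could look roughly like the sketch below. It only reuses the URL pattern quoted in this thread, which UniProt may have changed since, so check the current UniProt help pages before relying on it. The function names build_query_url and download are made up for illustration, and format=fasta is an assumption (the URL above used format=list, which returns accessions rather than sequences).

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# URL pattern taken from this thread; assumption: UniProt may have changed it since.
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_query_url(go_term, fmt="fasta"):
    """Return the search URL for one GO term such as 'GO:0070337'."""
    # The thread's URLs use 'go:' followed by the ID, so drop any 'GO:' prefix first.
    go_id = go_term.upper().replace("GO:", "")
    params = {"query": "go:" + go_id, "force": "yes", "format": fmt}
    return BASE_URL + "?" + urlencode(params)


def download(go_term, out_path, fmt="fasta"):
    """Fetch the search results and save them to out_path (needs network access)."""
    with urlopen(build_query_url(go_term, fmt)) as response, open(out_path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    print(build_query_url("GO:0070337"))
```

Calling download("GO:0070337", "out.fasta") would then write whatever the server returns for that query to out.fasta.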

python biopython uniprot • 11k views
modified 6.8 years ago • written 6.9 years ago by anlin0000790
4
6.9 years ago by
Niallhaslam2.3k
Dublin
Niallhaslam2.3k wrote:

Or more specifically this FAQ: http://www.uniprot.org/faq/28

written 6.9 years ago by Niallhaslam2.3k
2

That's the same one I linked.

written 6.8 years ago by quentin.delettre430
4
6.8 years ago by
Leszek4.0k
IIMCB, Poland
Leszek4.0k wrote:

You can use this:
wget --quiet -O- ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz | zcat | python uniprot2go.py GO:0070337 GO:0016021 > out.fasta

#!/usr/bin/env python
"""Fetch uniprot entries for given go terms"""
import sys
from Bio import SwissProt
# load GO terms from the command line
gos = set(sys.argv[1:])
sys.stderr.write("Looking for %s GO term(s): %s\n" % (len(gos), " ".join(gos)))
# parse the Swiss-Prot dump from stdin
k = 0
sys.stderr.write("Parsing...\n")
for i, r in enumerate(SwissProt.parse(sys.stdin)):
    # progress counter on stderr
    sys.stderr.write(" %9i\r" % (i + 1,))
    # each cross-reference is a tuple: (database, identifier, ...)
    for ex_db_data in r.cross_references:
        extdb, extid = ex_db_data[:2]
        if extdb == "GO" and extid in gos:
            k += 1
            sys.stdout.write(">%s %s\n%s\n" % (r.accessions[0], extid, r.sequence))
sys.stderr.write("Reported %s entries\n" % k)

For me it takes less than 6 minutes to parse the latest Swiss-Prot dump (this depends on your internet connection).
Of course, if you plan to run it multiple times, it is better to download the dump once and run the script on the local copy.
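
Running against a local copy might look like the sketch below, which reads the gzipped dump directly with gzip.open instead of piping through zcat. It assumes Biopython is installed and that the dump was saved locally beforehand. The names format_fasta and filter_local_dump and the script name in the usage comment are made up for illustration, and wrapping the sequence at 60 columns is my choice rather than something the script above does.

```python
import gzip
import sys


def format_fasta(accession, go_id, sequence, width=60):
    """Return one FASTA record with the sequence wrapped at `width` columns."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">%s %s\n%s\n" % (accession, go_id, "\n".join(lines))


def filter_local_dump(path, gos, out=sys.stdout):
    """Write a FASTA record for each entry in a local .dat.gz dump annotated with a term in gos."""
    # Biopython is imported here so that format_fasta can be used without it.
    from Bio import SwissProt
    count = 0
    with gzip.open(path, "rt") as handle:
        for record in SwissProt.parse(handle):
            # each cross-reference is a tuple: (database, identifier, ...)
            for xref in record.cross_references:
                extdb, extid = xref[:2]
                if extdb == "GO" and extid in gos:
                    count += 1
                    out.write(format_fasta(record.accessions[0], extid, record.sequence))
    return count


if __name__ == "__main__" and len(sys.argv) > 2:
    # hypothetical usage: python local_uniprot2go.py uniprot_sprot.dat.gz GO:0070337
    n = filter_local_dump(sys.argv[1], set(sys.argv[2:]))
    sys.stderr.write("Reported %s entries\n" % n)
```

As in the script above, an entry annotated with several of the requested GO terms is written once per matching term.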

modified 6.8 years ago • written 6.8 years ago by Leszek4.0k
1

http://www.uniprot.org/uniprot/?query=go:MYGO&format=TXTORRDF&compress=YESORNO

I don't think there is anything simpler.

modified 6.8 years ago • written 6.8 years ago by quentin.delettre430

Cool, I was not aware of this. Anyway, I cross-reference against UniProt regularly, and someone may benefit from the code :)

written 6.8 years ago by Leszek4.0k
2
6.9 years ago by
France
quentin.delettre430 wrote:

Read the UniProt FAQ and manual. Do your homework.

Faq & Help

written 6.9 years ago by quentin.delettre430