Blast Against Uniref90 Database Using Biopython?
4
1
8.3 years ago

Hi everyone: I'd like to know whether there is any way to run a BLAST search online against UniRef90 (UniProt) as the database from Biopython. Thanks!

biopython • 6.3k views
ADD COMMENT
2
8.3 years ago
Jerven ▴ 650

The official BLAST service against UniRef and all UniProt data is hosted by the EBI; this is the same API that UniProt.org itself uses.

You can find the documentation on the EBI services pages. If you do wish to run 7000 jobs, please be considerate and run them one after the other.
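As a rough illustration of running jobs one after the other against the EBI service, here is a minimal sketch using only the Python standard library. The base URL, the `run` / `status` / `result` endpoint layout, the form field names, the status strings and the database name `uniref90` are assumptions based on the general shape of the EBI REST tools, not something confirmed in this thread, so check them against the EBI documentation before use. The helper names (`build_job_params`, `blast_sequentially`, and so on) are hypothetical.

```python
import time
import urllib.parse
import urllib.request

# Assumed base URL of the EBI NCBI BLAST REST service; verify in the EBI docs.
BASE_URL = "https://www.ebi.ac.uk/Tools/services/rest/ncbiblast"


def build_job_params(sequence, email, database="uniref90"):
    """Return the form fields for one blastp job (field names are assumptions)."""
    return {
        "email": email,        # contact address for the job submitter
        "program": "blastp",
        "stype": "protein",
        "database": database,  # assumed identifier for UniRef90
        "sequence": sequence,
    }


def submit_job(params):
    """POST the job and return the job ID sent back as plain text."""
    data = urllib.parse.urlencode(params).encode("ascii")
    with urllib.request.urlopen(f"{BASE_URL}/run", data=data) as response:
        return response.read().decode("ascii").strip()


def wait_for_job(job_id, poll_seconds=10):
    """Poll the status endpoint until the job leaves the waiting states."""
    while True:
        with urllib.request.urlopen(f"{BASE_URL}/status/{job_id}") as response:
            status = response.read().decode("ascii").strip()
        if status not in ("RUNNING", "PENDING", "QUEUED"):
            return status
        time.sleep(poll_seconds)


def fetch_result(job_id, result_type="out"):
    """Download one result document for a finished job."""
    url = f"{BASE_URL}/result/{job_id}/{result_type}"
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def blast_sequentially(sequences, email):
    """Run one job at a time: submit, wait, fetch, then move to the next."""
    results = {}
    for name, sequence in sequences.items():
        job_id = submit_job(build_job_params(sequence, email))
        if wait_for_job(job_id) == "FINISHED":
            results[name] = fetch_result(job_id)
    return results
```

Because `blast_sequentially` only submits the next query after the previous one has finished, at most one job is on the server at any time, which is the considerate pattern asked for above.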

ADD COMMENT
1

Thanks very much. I'll be considerate.

ADD REPLY
2
4.3 years ago
parism9 ▴ 30

This thread is quite old, but if time or speed is an issue, Diamond is an accelerated BLAST aligner that is very fast. I've used it with a downloaded UniRef90 database, and it works both for local testing and development (2015 MacBook Pro) and on the production servers. I can share code with you if you want.
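For readers who want a starting point, the following is a minimal sketch of how a DIAMOND search against a downloaded UniRef90 FASTA file could be driven from Python. It is not the code offered in this answer. The file names and helper functions are hypothetical, and it assumes the `diamond` executable is on the PATH. Output format 6 is the BLAST-style tabular layout, where column 12 is the bit score.

```python
import csv
import subprocess


def diamond_makedb_cmd(fasta_path, db_prefix):
    """Command that builds a DIAMOND database from a protein FASTA file."""
    return ["diamond", "makedb", "--in", fasta_path, "--db", db_prefix]


def diamond_blastp_cmd(query_path, db_prefix, out_path, threads=4):
    """Command that searches protein queries and writes tabular output."""
    return [
        "diamond", "blastp",
        "--query", query_path,
        "--db", db_prefix,
        "--out", out_path,
        "--outfmt", "6",
        "--threads", str(threads),
    ]


def best_hits(tabular_lines):
    """Keep the highest-scoring subject per query from tabular output."""
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row:
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def run_diamond(fasta_path, query_path, db_prefix, out_path):
    """Build the database, run the search, and return the best hit per query."""
    subprocess.run(diamond_makedb_cmd(fasta_path, db_prefix), check=True)
    subprocess.run(diamond_blastp_cmd(query_path, db_prefix, out_path), check=True)
    with open(out_path, newline="") as handle:
        return best_hits(handle)
```

A call such as `run_diamond("uniref90.fasta", "queries.fasta", "uniref90", "hits.tsv")` would build the database once and search all queries in a single run; in practice the database only needs to be built the first time.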

ADD COMMENT
1
8.3 years ago
Peter 6.0k

UniProt offer a webform for BLAST here, http://www.uniprot.org/blast/ - but this is as far as I know for use 'by hand' only rather than scripting from something like Biopython. I'm not aware of anyone else hosting a free to use BLAST service against UniProt.

The alternative would be to download the UniProt database (in FASTA format), make a local BLAST database with makeblastdb, and then search it locally with blastp (from the NCBI BLAST+ suite). Your local Linux systems administrator may already have local copies of important BLAST databases like the NCBI NR database, and could add the UniProt databases for you perhaps?
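The local route described above could be scripted roughly as follows. This is a sketch under stated assumptions: NCBI BLAST+ is installed and on the PATH, and the file names, the e-value cutoff and the helper functions are hypothetical placeholders. It uses tabular output (`-outfmt 6`) so the hits can be read with plain Python; the `-num_threads` option is included because a single-threaded search against a database of this size can be slow.

```python
import subprocess

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_TABULAR_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def makeblastdb_cmd(fasta_path, db_name):
    """Command that builds a protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "prot", "-out", db_name]


def blastp_cmd(query_path, db_name, out_path, evalue=1e-5, threads=4):
    """Command that searches the local database and writes tabular hits."""
    return [
        "blastp",
        "-query", query_path,
        "-db", db_name,
        "-out", out_path,
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-num_threads", str(threads),
    ]


def parse_tabular_line(line):
    """Turn one tabular hit line into a dict keyed by column name."""
    return dict(zip(BLAST_TABULAR_FIELDS, line.rstrip("\n").split("\t")))


def run_local_blast(fasta_path, query_path, db_name, out_path):
    """Build the database, run blastp, and return the parsed hits."""
    subprocess.run(makeblastdb_cmd(fasta_path, db_name), check=True)
    subprocess.run(blastp_cmd(query_path, db_name, out_path), check=True)
    with open(out_path) as handle:
        return [parse_tabular_line(line) for line in handle if line.strip()]
```

All queries can go into one multi-sequence FASTA file, so the database is loaded once per run rather than once per query.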

How many query sequences do you have? Even the NCBI online BLAST service is not suitable for large numbers of queries; searches on that scale are best done locally (on your cluster if needed).

ADD COMMENT
0

Thanks Peter. Actually, I have downloaded the UniRef90 database (more than 7 GB unzipped) to my computer and created the BLAST database, but it takes too long for my computer to run a BLAST search for even a single query. So I was wondering how I can do this using the UniProt server from Biopython. I have already done this analysis against the nr database at NCBI for more than 7,000 sequences, using Biopython.

ADD REPLY
0

I doubt the NCBI would be very happy with you for BLAST'ing 7000 sequences against NR like that, but at least they do officially expose this service. I've asked UniProt via Twitter if they have an official API, https://twitter.com/pjacock/status/383191702322020352

ADD REPLY
0

Jerven from UniProt has replied (as an answer to your question), https://twitter.com/jervenbolleman/status/383239462475399168

ADD REPLY
0

At least all blasts were carried out one after the other during the day. Thanks for your answer

ADD REPLY
1
8.3 years ago
vaskin90 ▴ 290

If it helps, UniProt BLAST has an undocumented API via GET requests, and it actually works quite reliably. Here is a sample URL: http://www.uniprot.org/blast/?query=TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN&dataset=uniprotkb&threshold=10&matrix=&filter=false&gapped=true&numal=250 With this link you start a job and the server sends you its ID (20130926517MACV51A). The job status is then available at a URL like this: http://www.uniprot.org/blast/uniprot/20130926517MACV51A.stat.

ADD COMMENT
1

Please do not do this. Use the linked official API described here. Scripted blasting against the URI in this answer might be blocked or seen as abusive.

ADD REPLY
0

Very useful. Thanks

ADD REPLY
