Using the BLAST API from Python 3
4.3 years ago
Freek ▴ 30

Hi all,

I'm trying to use the BLAST API (https://blast.ncbi.nlm.nih.gov/Blast.cgi) from Python with the requests module. My goal is to send a sequence and get genomic (Ensembl GRCh38) coordinates back.

request = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?QUERY=gagtctcctttggaactctgcaggttctatttgctttttcccagatgagctctttttctggtgtttgtct&DATABASE=nt&PROGRAM=blastn&CMD=Put&FORMAT_TYPE=JSON2'

(This sequence is part of the ACTB gene)

I sent it to the server like this:

response = requests.get(request)

The response looks like:

print(response)
<Response [200]>
print(response.headers)
{'Server': 'Apache', 'Set-Cookie': 'BlastCubbyImported=passive; domain=ncbi.nlm.nih.gov, MyBlastUser=1lgZT_2PBCUePBfITK86610D67; domain=.ncbi.nlm.nih.gov; path=/, ncbi_sid=5AAB86A694B876A1_0000SID; domain=.nih.gov; path=/; expires=Fri, 22 Jun 2018 09:01:30 GMT', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'Content-Security-Policy': 'upgrade-insecure-requests', 'X-UA-Compatible': 'IE=Edge', 'Cache-Control': 'private', 'Referrer-Policy': 'origin-when-cross-origin', 'NCBI-SID': '5AAB86A694B876A1_0000SID', 'NCBI-PHID': '5AAB86A694B876A10000000000000001.m_1', 'Keep-Alive': 'timeout=1, max=10', 'X-XSS-Protection': '1; mode=block', 'Content-Type': 'text/html', 'Transfer-Encoding': 'chunked', 'Date': 'Thu, 22 Jun 2017 09:01:30 GMT', 'Connection': 'Keep-Alive'}

print(response.content)
b'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml">\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>\n<meta name="jig" content="ncbitoggler ncbiautocomplete"/>\n<meta name="ncbi_app" content="static"/>\n<meta name="ncbi_pdid" content="blastformatreq"/>\n<meta name="ncbi_stat" content="false"/>\n<meta name="ncbi_sessionid" content="5AAB86A694B876A1_0000SID"/>\n<meta name="ncbi_phid" content="5AAB86A694B876A10000000000000001"/>\nNCBI Blast\n<link rel="stylesheet" type="text/css" href="css/header.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/google-fonts.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/footer.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/main.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/common.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/blastReq.css" media="screen"/>\n\n<link rel="stylesheet" type="text/css" href="css/print.css" media="print"/>\n\n\n\n<script type="text/javascript" src="/core/jig/1.14.8/js/jig.min.js             "></script>   \n<script type="text/javascript" src="js/utils.js"></script>\n<script type="text/javascript" src="js/blast.js"></script>\n<script type="text/javascript" src="js/format.js"></script>\n\n</head>\n\n<body id="type-a">\n\n
\n\t\t \t\n

Most of it is cut off because of the character limit of this post.

This is unexpected, isn't it? The response is not JSON and is difficult to parse; it looks like I am getting a web page back somehow.

Any suggestions?

Best regards,

Freek.

python python3 blast api • 4.7k views
4.3 years ago

How set are you on using JSON as the response format?
Have you considered using the BLAST API from Biopython? http://biopython.org/DIST/docs/api/Bio.Blast-module.html

-> very easy to parse!


Hi Gunnar, thanks for your response.

Hmm, before asking to have such things installed on our compute cluster, I would prefer to try this minimal approach. I will look into Biopython, but for portability I would still prefer minimal, self-made, flexible code and an easy-to-parse JSON response, if anybody can get it to work :)

I feel I'm missing a very small thing.


By the way, whatever FORMAT_TYPE I use, I get the same HTML page as a response.
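
A likely explanation, going by NCBI's documented URL API: CMD=Put only submits the search, and the page it returns carries a request ID (RID) and an estimated wait (RTOE) inside a QBlastInfo comment. FORMAT_TYPE only takes effect on a later CMD=Get request for that RID. The sketch below shows that submit, poll, fetch cycle using only the standard library (urllib could be swapped for requests). The helper names are made up for illustration, sending every call as a POST is an assumption, and the default of XML is chosen because it has not been verified here whether JSON2 comes back as plain JSON or as an archive.

```python
import re
import time
from urllib.parse import urlencode
from urllib.request import urlopen

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def parse_qblast_info(html):
    """Pull the RID and RTOE (seconds) out of the page returned by CMD=Put."""
    rid = re.search(r"^\s*RID = (\S+)", html, re.MULTILINE)
    rtoe = re.search(r"^\s*RTOE = (\d+)", html, re.MULTILINE)
    if rid is None:
        raise ValueError("no RID found in the Put response")
    return rid.group(1), int(rtoe.group(1)) if rtoe else 0


def parse_status(text):
    """Return the Status value (WAITING, READY, FAILED, UNKNOWN) or None."""
    match = re.search(r"Status=(\w+)", text)
    return match.group(1) if match else None


def blast_call(**params):
    """Send the parameters to Blast.cgi (as a POST) and return the body as text."""
    with urlopen(BLAST_URL, data=urlencode(params).encode()) as resp:
        return resp.read().decode("utf-8", errors="replace")


def run_blast(query, program="blastn", database="nt", format_type="XML"):
    """Submit a search, wait until it is ready, then fetch the formatted result."""
    html = blast_call(CMD="Put", PROGRAM=program, DATABASE=database, QUERY=query)
    rid, rtoe = parse_qblast_info(html)
    time.sleep(rtoe)  # wait for the estimated run time before the first poll
    while True:
        info = blast_call(CMD="Get", FORMAT_OBJECT="SearchInfo", RID=rid)
        status = parse_status(info)
        if status == "READY":
            break
        if status in ("FAILED", "UNKNOWN"):
            raise RuntimeError(f"search {rid} ended with status {status}")
        time.sleep(60)  # NCBI asks clients not to poll an RID more than once a minute
    return blast_call(CMD="Get", FORMAT_TYPE=format_type, RID=rid)
```

Calling run_blast() with the sequence from the question should then return the report as text, which can be parsed or written to a file.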


Any tips on using Biopython then?

If I do this:

from Bio.Blast import NCBIWWW
result_handle = NCBIWWW.qblast("blastn", "nt", 'gagtctcctttggaactctgcaggttctatttgctttttcccagatgagctctttttctggtgtttgtct')

I get some XML output, but when I try to print result_handle again, it is empty! How can I save the result, for example?


Never mind, I found this in the Biopython Cookbook:

"We need to be a bit careful since we can use result_handle.read() to read the BLAST output only once – calling result_handle.read() again returns an empty string."

I really don't understand the reason for this; it made me re-run BLAST many, many times, wondering what went wrong in the part of my script after the .read(). Anyway, thanks for the suggestion.
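
The reason is ordinary file-handle behaviour: read() moves a cursor to the end of the stream, so a second read() has nothing left to return. The sketch below reproduces this with io.StringIO as a stand-in for the handle returned by NCBIWWW.qblast (the XML string and the file name are placeholders), and shows two ways around it: keep the text in a variable and save it, or rewind the handle, which works only if the handle is seekable.

```python
import io
import os
import tempfile

# Stand-in for the handle returned by NCBIWWW.qblast(...); placeholder XML.
result_handle = io.StringIO("<BlastOutput>placeholder</BlastOutput>")

blast_xml = result_handle.read()    # first read consumes the whole stream
second_read = result_handle.read()  # cursor is at the end, so this is ""

# Option 1: keep the text in a variable and save it, so no re-BLAST is needed.
out_path = os.path.join(tempfile.gettempdir(), "my_blast.xml")
with open(out_path, "w") as out:
    out.write(blast_xml)

# Option 2: rewind the handle (assumes a seekable, in-memory handle).
result_handle.seek(0)
rewound = result_handle.read()
```

Saving to a file first also means the parsing part of a script can be debugged against the saved XML instead of sending a new search each time.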
