Question: Cannot POST a query to a web server using httplib and urllib in Python
0
4.7 years ago by
mgab50
Germany
mgab50 wrote:

I am trying to POST a query to a web server: http://www.imtech.res.in/raghava/antibp/submit.html

but I am getting an error:

Traceback (most recent call last):
  File "crawler.py", line 4, in <module>
    conn = httplib.HTTPConnection("http://www.imtech.res.in/raghava/antibp/submit.html")
  File "/usr/lib/python2.7/httplib.py", line 704, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.7/httplib.py", line 732, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: '//www.imtech.res.in/raghava/antibp/submit.html'

The python script is shown below:

import httplib, urllib
params = urllib.urlencode({'seqname':'GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR','format':'Amino acid sequence in single letter code', 'terminus':'N-terminus', 'method':'svm', 'svm_th':'0', 'type': 'Submit'})
headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain"}
conn = httplib.HTTPConnection("http://www.imtech.res.in/raghava/antibp/submit.html")
conn.request("POST", "", params, headers)
response = conn.getresponse()
print response.status, response.reason
data = response.read()
conn.close()

What could be the problem? Thank you.

python • 2.7k views
modified 4.7 years ago by Saulius Lukauskas530 • written 4.7 years ago by mgab50
3
4.7 years ago by
dariober11k
WCIP | Glasgow | UK
dariober11k wrote:

This is how I would do it, with the disclaimer that I'm no expert in querying web pages and I don't know anything about the server in question:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
br.open("http://www.imtech.res.in/raghava/antibp/submit.html")
br.select_form(nr = 0)

## See what is available on this web page:
for f in br.forms():
    print f

#<POST http://www.imtech.res.in/cgibin/antibp/antibp1.pl multipart/form-data
#  <TextControl(seqname=)>
#  <TextareaControl(seq=)>
#  <FileControl(file=<No files added>)>
#  <SelectControl(format=[*nformat, sformat])>
#  <RadioControl(terminus=[*1, 2, 3])>
#  <RadioControl(method=[*1, 2, 3])>
#  <TextControl(svm_th=0)>
#  <TextControl(ann_th=0.6)>
#  <TextControl(qm_th=-0.2)>
#  <SubmitControl(<None>=Submit) (readonly)>
#  <IgnoreControl(<None>=<None>)>>

## Input your sequence and parameters:
br['seqname']= 'myseq'
br['seq']= 'GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR'
br['format']= ['nformat']
br['terminus']= ['1']
br['svm_th']= '0'

## Submit and collect results:
res= br.submit()
html= res.read()

Now html is a string containing the HTML of the results page, which you can parse with an HTML parser or some other tool. The relevant part of html should look like:

<td><font size="4"><b>Antibacterial Activiy</b></font></td></tr><tr>
<td align="CENTER">GICACRRRFCPNSER</td><td align="CENTER">1</td><td align="CENTER">1.975</td><td align="CENTER">YES</td></tr><tr>
<td align="CENTER">GYCRVNGARYVRCCS</td><td align="CENTER">18</td><td align="CENTER">1.051</td><td align="CENTER">YES</td></tr><tr>
<td align="CENTER">ICACRRRFCPNSERF</td><td align="CENTER">2</td><td align="CENTER">1.001</td><td align="CENTER">YES</td></tr><tr>
...
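
As one way of doing that parsing with only the standard library, here is a minimal sketch that collects the centred table cells row by row. The sample is just the three rows shown above; the class name CenteredCells is made up for this example, and the real page may contain other centred cells that would need filtering.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class CenteredCells(HTMLParser):
    """Collect the text of <td align="CENTER"> cells, grouped by <tr>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []
        self._text = None                 # None while outside a centred cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # every <tr> opens a new row
        elif tag == "td" and (dict(attrs).get("align") or "").upper() == "CENTER":
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._text is not None:
            if not self.rows:             # cell seen before any <tr>
                self.rows.append([])
            self.rows[-1].append("".join(self._text).strip())
            self._text = None


# Sample copied from the output above (the header spelling is the server's own).
sample = (
    '<td><font size="4"><b>Antibacterial Activiy</b></font></td></tr><tr>\n'
    '<td align="CENTER">GICACRRRFCPNSER</td><td align="CENTER">1</td>'
    '<td align="CENTER">1.975</td><td align="CENTER">YES</td></tr><tr>\n'
    '<td align="CENTER">GYCRVNGARYVRCCS</td><td align="CENTER">18</td>'
    '<td align="CENTER">1.051</td><td align="CENTER">YES</td></tr><tr>\n'
    '<td align="CENTER">ICACRRRFCPNSERF</td><td align="CENTER">2</td>'
    '<td align="CENTER">1.001</td><td align="CENTER">YES</td></tr><tr>\n'
)

parser = CenteredCells()
parser.feed(sample)
parser.close()
rows = [row for row in parser.rows if row]   # drop the empty trailing row

for row in rows:
    print(row)
```

In real use you would feed html from br.submit() instead of sample. The meaning of the four columns is not stated in the output above, so check it against the results page before relying on it.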

 

2
4.7 years ago by
London, UK
Saulius Lukauskas530 wrote:

Don't use httplib (or the other standard-library HTTP modules) directly, if you want to stay sane, that is.

Have a look at the requests library instead; I bet you will be able to code your request just by reading the first page of its documentation.
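
For what it's worth, a rough sketch of what that could look like. It assumes the form action URL and the control names and values from the form dump in dariober's answer, it has not been tested against the server, and the helper names build_fields and submit are made up for the example.

```python
# Form action URL as reported in the form dump in dariober's answer.
FORM_URL = "http://www.imtech.res.in/cgibin/antibp/antibp1.pl"


def build_fields(name, sequence):
    """Form fields, using the control names and default values from that dump."""
    return {
        "seqname": name,
        "seq": sequence,
        "format": "nformat",
        "terminus": "1",
        "method": "1",
        "svm_th": "0",
        "ann_th": "0.6",
        "qm_th": "-0.2",
    }


def submit(fields, url=FORM_URL):
    """Send the fields as multipart/form-data and return the response HTML."""
    import requests  # third-party package: pip install requests

    # A (None, value) tuple makes requests send a plain form field
    # rather than a file upload, while still using multipart encoding.
    multipart = dict((key, (None, value)) for key, value in fields.items())
    response = requests.post(url, files=multipart, timeout=60)
    response.raise_for_status()
    return response.text


fields = build_fields("myseq", "GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR")
# html = submit(fields)   # uncomment to actually contact the server
```

The multipart encoding is used because the dump reports the form as multipart/form-data; whether the server would also accept a plain urlencoded body is not something this thread establishes.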

0
4.7 years ago by
RamRS25k
Houston, TX
RamRS25k wrote:

Try omitting the http:// part of the URL you pass to httplib.HTTPConnection(). The constructor seems to split the string at the colon and treat everything after it as the port number.
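
To illustrate that behaviour, here is a simplified sketch of such a host/port split. It is not httplib's actual code (the real parser also handles bracketed IPv6 addresses, for instance); the name split_host_port and the default port of 80 are assumptions made for the example.

```python
def split_host_port(host, default_port=80):
    """Simplified illustration: split 'host[:port]' at the last colon."""
    i = host.rfind(":")
    if i < 0:
        return host, default_port         # no colon: fall back to the default
    port_text = host[i + 1:]
    if not port_text.isdigit():
        # A full URL ends up here: the text after "http:" is not a number.
        raise ValueError("nonnumeric port: %r" % port_text)
    return host[:i], int(port_text)


print(split_host_port("www.imtech.res.in"))
print(split_host_port("www.imtech.res.in:8080"))
try:
    split_host_port("http://www.imtech.res.in/raghava/antibp/submit.html")
except ValueError as err:
    print(err)
```

The last call fails in the same way as the traceback in the question, because the only colon in the string is the one after http.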

0
4.7 years ago by
mgab50
Germany
mgab50 wrote:
I have changed the line to:

 conn = httplib.HTTPConnection("www.imtech.res.in/raghava/antibp/submit.html")

but I am getting the error:

socket.gaierror: [Errno -2] Name or service not known

 

Please assist.

modified 4.7 years ago • written 4.7 years ago by mgab50
0
4.7 years ago by
Devon Ryan94k
Freiburg, Germany
Devon Ryan94k wrote:

This is really more of a Python question than a bioinformatics one.

Only the server name should be included in HTTPConnection():

conn = httplib.HTTPConnection("www.imtech.res.in")
conn.request("POST", "/raghava/antibp/submit.html", params, headers)

I've not tested that, but it's at least closer to being correct.
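
If you would rather start from the full URL, a sketch along these lines splits it into the two pieces first. When the path is left in the host string, the whole text is looked up as a hostname, which is consistent with the "Name or service not known" error reported above. The helper names split_url and post are made up for the example, and post has not been run against this server.

```python
try:
    from urllib.parse import urlsplit        # Python 3
    from http.client import HTTPConnection
except ImportError:
    from urlparse import urlsplit            # Python 2
    from httplib import HTTPConnection


def split_url(url):
    """Return (host, path) for an http:// URL; any query string is ignored."""
    parts = urlsplit(url)
    return parts.netloc, parts.path or "/"


def post(url, body, headers):
    """POST body to url and return (status, reason, data)."""
    host, path = split_url(url)
    conn = HTTPConnection(host)              # host only: no scheme, no path
    try:
        conn.request("POST", path, body, headers)
        response = conn.getresponse()
        return response.status, response.reason, response.read()
    finally:
        conn.close()


host, path = split_url("http://www.imtech.res.in/raghava/antibp/submit.html")
print(host)
print(path)
```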

0
4.7 years ago by
mgab50
Germany
mgab50 wrote:

I have done this, but the response is just the submit.html page itself. I then changed the request to:

conn = httplib.HTTPConnection("www.imtech.res.in")
conn.request("POST", "/cgibin/antibp/antibp1.pl", params, headers)

but it is still not working.

 

 

modified 4.7 years ago • written 4.7 years ago by mgab50

1. "Still not working" isn't something that anyone can help you with.

2. Try a Python forum.

Reply written 4.7 years ago by Devon Ryan94k
0
4.7 years ago by
mgab50
Germany
mgab50 wrote:

Excellent, dariober. It is working perfectly. Unbelievable. Thank you very much.
