Why Posting Data To PlantCARE With urllib/urllib2 In Python Returns 'HTTP Error 400: Bad Request'
13.0 years ago
Gahoo ▴ 270

I was trying to write a Python script to get data from PlantCARE, but it always returns 'HTTP Error 400: Bad Request'. urllib and urllib2 work fine with other websites.

Here's the code:

import urllib2, urllib
url="http://bioinformatics.psb.ugent.be/webtools/plantcare/cgi-bin/CallMat_IE55.htpl"
params={
    'Field_Sequence':'CTAATCTTATGCATTTAGCAGTACAAATTCAAAAATTTCCCATTTTTATTCATGAATCATACCATTATATATTAACTAAATCCAAGGTAAAAAAAAGGTATGAAAGCTCTATAGTAAGTAAAATATAAATTCCCCATAAGGAAAGGGCCAAGTCCACCAGGCAAGTAAAATGAGCAAGCACCACTCCACCATCACACAATTTCACTCATAGATAACGATAAGATTCATGGAATTATCTTCCACGTGGCATTATTCCAGCGGTTCAAGCCGATAAGGGTCTCAACACCTCTCCTTAGGCCTTTGTGGCCGTTACCAAGTAAAATTAACCTCACACATATCCACACTCAAAATCCAACGGTGTAGATCCTAGTCCACTTGAATCTCATGTATCCTAGACCCTCCGATCACTCCAAAGCTTGTTCTCATTGTTGTTATCATTATATATAGATGACCAAAGCACTAGACCAAACCTCAGTCACACAAAGAGTAAAGAAGAACAA',
    'Field_SequenceName':'demo',
    'Field_SequenceDate':'4.27',
    'Mode':'readonly',
    'StartAt':'0',
    'NbRecs':'10',
    'MatInspector':'Search'
    }
data=urllib.urlencode(params)
print urllib2.urlopen(url, data).read()

But I can get the result page directly with curl, which is weird! Opening the same link in any browser also works fine.

curl "http://bioinformatics.psb.ugent.be/webtools/plantcare/cgi-bin/CallMat_IE55.htpl?Mode=readonly&StartAt=0&Field_Sequence=CTAATCTTATGCATTTAGCAGTACAAATTCAAAAATTTCCCATTTTTATTCATGAATCATACCATTATATATTAACTAAATCCAAGGTAAAAAAAAGGTATGAAAGCTCTATAGTAAGTAAAATATAAATTCCCCATAAGGAAAGGGCCAAGTCCACCAGGCAAGTAAAATGAGCAAGCACCACTCCACCATCACACAATTTCACTCATAGATAACGATAAGATTCATGGAATTATCTTCCACGTGGCATTATTCCAGCGGTTCAAGCCGATAAGGGTCTCAACACCTCTCCTTAGGCCTTTGTGGCCGTTACCAAGTAAAATTAACCTCACACATATCCACACTCAAAATCCAACGGTGTAGATCCTAGTCCACTTGAATCTCATGTATCCTAGACCCTCCGATCACTCCAAAGCTTGTTCTCATTGTTGTTATCATTATATATAGATGACCAAAGCACTAGACCAAACCTCAGTCACACAAAGAGTAAAGAAGAACAA&Field_SequenceName=Sequence+test&Field_SequenceDate=4.27&NbRecs=10&MatInspector=Search"
python
13.0 years ago
Bio_X2Y ★ 4.4k

It seems that the Accept header is needed, but I don't know why.

Making the following changes to your code seems to work for me:

data = urllib.urlencode(params)
headers = {"Accept" : "*/*"}
req = urllib2.Request(url, data, headers)
print urllib2.urlopen(req).read()
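
For anyone on Python 3, where urllib2 no longer exists, the same fix should translate roughly to urllib.request. This is only a sketch and I haven't re-tested it against the PlantCARE server:

from urllib import parse, request

url = "http://bioinformatics.psb.ugent.be/webtools/plantcare/cgi-bin/CallMat_IE55.htpl"
params = {
    'Field_Sequence': 'CTAATCTTATGCATTTAGCAGTACAAATT',  # paste the full sequence from the question here
    'Field_SequenceName': 'demo',
    'Field_SequenceDate': '4.27',
    'Mode': 'readonly',
    'StartAt': '0',
    'NbRecs': '10',
    'MatInspector': 'Search',
}
data = parse.urlencode(params).encode('ascii')                 # POST body must be bytes in Python 3
req = request.Request(url, data=data, headers={'Accept': '*/*'})
print(request.urlopen(req).read().decode('utf-8', 'replace'))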

Thanks a lot, it's working now! I captured the packet sent by Firefox 4 with Wireshark. The Accept header is below:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Would it be helpful to find out the reason?


I wouldn't worry about the reason too much; it's most likely a server setup that tries to ward off certain types of requests. Props to Bio_X2Y for figuring it out, though - an error like this would have stumped me for ages too.
