Trouble Submitting Forms on Duet for Protein Stability using Web scraping technique (Requests and BeautifulSoup Packages)
13 months ago

I'm trying to check the stability of a set of proteins using two tools: FoldX and Duet. FoldX is an offline tool, while Duet is available online. I'm writing a script to interact with Duet, where the script will scrape the results based on the queries I send. However, I've run into an issue where the form on Duet's website seems incomplete. When I try to submit the form, I get an error saying the form action is invalid. How can I resolve this issue?

#!/usr/bin/python3

import os
import requests
from bs4 import BeautifulSoup as bs

pdbFile, mutation, fChain = "4fdi.pdb", "E65W", "A"
url = "https://biosig.lab.uq.edu.au/duet/stability"
response = requests.get(url)
soup = bs(response.content, 'html.parser')
form = soup.find_all('form')[1]
files = {
    "wild": open(pdbFile, 'rb')
}
form_data = {
    'mutation': mutation,
    'chain': fChain,
    'run':'single'
}
response_2 = requests.post(form['action'], files=files, data=form_data)

Error which I am encountering:

MissingSchema: Invalid URL '/duet/stability_prediction': No scheme supplied. Perhaps you meant http:///duet/stability_prediction?
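
A likely reading of this traceback: the form's action attribute holds a site-relative path ('/duet/stability_prediction'), and requests only accepts absolute URLs, so the path has to be resolved against the URL of the page the form came from. A minimal sketch using the standard library's urljoin follows; the helper name resolve_form_action is hypothetical, and the sketch only builds the URL without contacting the server.

```python
from urllib.parse import urljoin


def resolve_form_action(page_url, action):
    """Turn a form's (possibly relative) action into an absolute URL."""
    # An empty or missing action means the form submits to the page itself,
    # and urljoin returns the base URL unchanged for an empty string.
    return urljoin(page_url, action or "")


# URL of the page the form was scraped from (taken from the question).
page_url = "https://biosig.lab.uq.edu.au/duet/stability"

# Relative action reported in the MissingSchema error message.
post_url = resolve_form_action(page_url, "/duet/stability_prediction")
print(post_url)  # https://biosig.lab.uq.edu.au/duet/stability_prediction
```

In the script above, this would mean passing resolve_form_action(url, form['action']) to requests.post in place of form['action']. Whether the server then accepts the submission depends on the field names and any hidden inputs in the form, which this sketch does not check.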
web-scraping web-service protein python • 852 views

If I were you, I'd get explicit permission from the organization before possibly abusing their service the way you're doing it. Anyway, to debug, first find out which line is causing the error - the requests.get or the requests.post.


I'm not abusing the organization's services. If you look at the code, I use requests.get to retrieve the Duet page where the query can be submitted. As the post title says, I am having trouble submitting the form, and the form is submitted with requests.post, so the error is definitely coming from requests.post. Additionally, Duet is free for academic use.


I'm not abusing the organization's services.

The organization should say that, not you. Like I said, check with them if you can use programmatic POST requests to submit multiple queries. If an organization were to explicitly allow programmatic access, they would ideally expose an API. Also, not a lot of places offer POST requests through an API - it's mostly GET.

Additionally, Duet is free for academic use.

Regular use, not programmatic POST requests.
