Question: Eutils Request Returns Random Results
7.0 years ago by
United Kingdom
charles.hebert50 wrote:

Dear community,

I'm trying to download a small Entrez dataset using eutils. I have tried three libraries for this: BioPerl, Biopython, and the Python requests library.

Unfortunately, the download is not robust: the result is often empty (the response status code is 200, but the XML contains an error).

<eSummaryResult>
  <ERROR>Unable to obtain query #1</ERROR>
</eSummaryResult>
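
Since the HTTP status is 200 either way, the response body is the only place where the failure shows up. A minimal sketch of checking for it with the Python standard library follows; the helper name `esummary_error` is hypothetical, and it assumes the failure is always reported as an `<ERROR>` element as in the sample above.

```python
import xml.etree.ElementTree as ET


def esummary_error(xml_text):
    """Return the text of the first <ERROR> element, or None if there is none."""
    root = ET.fromstring(xml_text)
    error = root.find(".//ERROR")
    return error.text if error is not None else None


# The failing response quoted above.
failed = """<eSummaryResult>
  <ERROR>Unable to obtain query #1</ERROR>
</eSummaryResult>"""

print(esummary_error(failed))
```

A caller can treat any non-`None` return value as a failed request, regardless of the status code.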

I use an epost command with a small set of GIs (10 to 100). The WebEnv, query_key, retstart, and retmax parameters are then used to build the epost or esummary request.

If I paste the generated URL into my browser, it works! If I rerun the code, it works! If I rerun it again, I get the error!

I'm really disappointed... Do you have any idea what I missed?

Thanks

perl python entrez eutils • 2.6k views
modified 6.9 years ago by matthiassamwald30 • written 7.0 years ago by charles.hebert50

You will have to show your code if you want a helpful response. It is not possible to diagnose your issues otherwise.

modified 7.0 years ago • written 7.0 years ago by SES8.3k

The code is quite simple. It uses Biopython (as described in the tutorial):

Get the WebEnv and query key using Entrez.epost(db="nuccore", id="417075336,407894523")

for (start = 1; start <= number of IDs; start += batchsize):
     Entrez.esummary(db="nuccore", webenv=X, query_key=Y, retstart=start, retmax=batchsize, ...)

This behavior occurs randomly, even if I manually build the request (using urllib or requests) and parse the XML.

(Note: I sleep for 1 second between queries.) Maybe it is a bug in eutils?
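
To make the batching concrete, here is a rough sketch of how the per-batch esummary URLs could be built by hand with the standard library. The function name `esummary_urls` and the placeholder WebEnv value are hypothetical; real WebEnv and query_key values come from the epost response. It assumes retstart begins at 1, as in the pseudocode above, and that the request goes to the public esummary.fcgi endpoint.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint for ESummary.
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def esummary_urls(webenv, query_key, total, batch_size, db="nuccore"):
    """Yield one ESummary URL per batch of records posted with EPost."""
    # retstart starts at 1 here, following the pseudocode in the post.
    for start in range(1, total + 1, batch_size):
        params = {
            "db": db,
            "WebEnv": webenv,
            "query_key": query_key,
            "retstart": start,
            "retmax": batch_size,
        }
        yield ESUMMARY_URL + "?" + urlencode(params)


# Placeholder values for illustration only.
for url in esummary_urls("WEBENV_PLACEHOLDER", "1", total=5, batch_size=2):
    print(url)
```

Printing the URLs this way makes it easy to paste the exact same request into a browser and compare the responses.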

modified 7.0 years ago by Istvan Albert ♦♦ 82k • written 7.0 years ago by charles.hebert50

That does not look like Python code.

written 7.0 years ago by Istvan Albert ♦♦ 82k

Pseudo code + the biopython Entrez.esummary call.

written 7.0 years ago by charles.hebert50

I don't know which tutorial that is, but it is not the Biopython tutorial, which contains lots of useful EUtils code examples.

modified 7.0 years ago • written 7.0 years ago by Neilfws48k

I use the epost / esummary commands. As I said before, the request succeeds (code 200), but the response randomly contains an "Unable to obtain query #1" error. The bug is not linked to a specific library (bioperl / biopython / requests / direct use of urllib2...). If I paste the same URL into a browser, it works.

modified 7.0 years ago • written 7.0 years ago by charles.hebert50
6.9 years ago by
matthiassamwald30 wrote:

I just want to add that I experience the same problem with eUtils (in my case, I am using a PHP script, but the bug is indeed independent of the implementation). I ran the same script last year, and it worked without these errors, so the problem seems to be caused by the eUtils server.

written 6.9 years ago by matthiassamwald30

After more research, I found that the most common solution to the problem is simply to retry automatically until the eUtils server responds properly again. That seems to work.
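
A retry loop along those lines might look like the sketch below. The function name `fetch_with_retries`, the attempt limit, and the growing delay are all assumptions chosen for illustration; `fetch` stands for whatever callable performs the actual request and returns the response body as text.

```python
import time
import xml.etree.ElementTree as ET


def fetch_with_retries(fetch, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch() until the XML it returns has no <ERROR> element.

    fetch is any zero-argument callable returning the response body as text.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        body = fetch()
        error = ET.fromstring(body).find(".//ERROR")
        if error is None:
            return body
        last_error = error.text
        if attempt < max_attempts:
            sleep(delay * attempt)  # wait a little longer after each failure
    raise RuntimeError(f"still failing after {max_attempts} attempts: {last_error}")


# Stand-in for a real request: fails twice, then succeeds.
responses = iter([
    "<eSummaryResult><ERROR>Unable to obtain query #1</ERROR></eSummaryResult>",
    "<eSummaryResult><ERROR>Unable to obtain query #1</ERROR></eSummaryResult>",
    "<eSummaryResult><DocSum><Id>417075336</Id></DocSum></eSummaryResult>",
])
result = fetch_with_retries(lambda: next(responses), delay=0.0)
print(result)
```

Capping the number of attempts keeps the script from looping forever if the server never recovers.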

written 6.9 years ago by matthiassamwald30

It is also important to note that NCBI asks users not to make more than 3 requests per second. See here: http://www.ncbi.nlm.nih.gov/books/NBK25497/ In my experience, you will get random errors if you do not comply with this rule.
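
One way to stay under that limit is to space requests out on the client side. The sketch below is only an illustration: the `RateLimiter` class is hypothetical, not part of any E-utilities library, and it simply enforces a minimum gap of one third of a second between calls.

```python
import time


class RateLimiter:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


limiter = RateLimiter(max_per_second=3)
start = time.monotonic()
for _ in range(4):
    limiter.wait()  # call this right before each E-utilities request
elapsed = time.monotonic() - start
print(f"4 calls took {elapsed:.2f} s")
```

Calling `limiter.wait()` before every request keeps the spacing in one place, so it also applies to retries.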

written 6.9 years ago by lelle820