Question: Entrez.efetch HTTP Error 400: Bad request
22 months ago, wanderingstefan wrote:

Hey everyone,

I have a weird problem. I am trying to download a number of assemblies from GenBank using Entrez.efetch. Here is my code:

from Bio import Entrez, SeqIO
import csv, sys, os, time, shutil
import httplib, urllib2

Entrez.email="mymail@gmail.com"

def download_genomes():

    #Search for all bacterial assemblies in the assembly database and get their ids
    search_term= "bacteria[orgn] AND all[filter]"
    handle=Entrez.esearch(db="assembly", retmax=500000, term=search_term)
    genome_id=Entrez.read(handle)['IdList']
    print "Fetched Id list..."

    for id in genome_id:

        while True:
            try:
                #Fetch the entry corresponding to the id
                record=Entrez.efetch(db='assembly', id=id, rettype='fasta', retmode='text')
                time.sleep(3)
                seq_record=Entrez.efetch(db='assembly', id=id, rettype='gbwithparts', retmode='text')
                seq_meta=SeqIO.read(seq_record, "genbank")
                .            
                .  
                .
                #Skipped rest of the code which writes downloaded files to spec. dirs and so on

However, when I run this, I get an HTTP Error 400: Bad request for every id. When I try the ids manually in GenBank, the entry is found. Does somebody know what could be going on here? I would appreciate the help!

Cheers!


NCBI moved to HTTPS-only connections last year. Are you using the latest version of all modules?

written 22 months ago by genomax

Yes, that sounds like a very good explanation. I don't think your code is wrong; 400 errors usually mean the server could not accept the request as it was sent (wrong URL, moved resource, etc.) ...

written 22 months ago by LLTommy
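
One way to narrow down a 400 is to read the response body that comes back with the error, since servers often say there what they disliked about the request. Below is a minimal Python 3 sketch using only the standard library (the code in the question is Python 2 and uses urllib2). It assumes the exception that reaches your code is a urllib.error.HTTPError; `describe_http_error` is a hypothetical helper name, not part of Biopython, and the URL and error text in the demonstration are made up.

```python
import io
import urllib.error


def describe_http_error(err, max_chars=500):
    """Return a one-line summary of an HTTPError, including its response body."""
    try:
        # The error object doubles as the response, so the body can be read from it.
        body = err.read().decode("utf-8", errors="replace")
    except Exception:
        body = ""
    # Collapse whitespace and truncate so the summary stays on one line.
    body = " ".join(body.split())[:max_chars]
    return "HTTP {} {} for {}: {}".format(
        err.code, err.reason, err.url, body or "<empty body>"
    )


# Offline demonstration with a hand-built error (no network access needed).
fake = urllib.error.HTTPError(
    "https://example.org/efetch",          # made-up URL
    400,
    "Bad Request",
    None,
    io.BytesIO(b"illustrative error text\n"),  # made-up response body
)
print(describe_http_error(fake))
```

In a download loop, the same helper could be called from an `except urllib.error.HTTPError as err:` branch around the request, so that each failing id is logged together with the server's explanation.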

Any idea what could be wrong? I can find the genome in question with the id fetched by Entrez, so I don't think the resource itself was moved...

modified 22 months ago • written 22 months ago by wanderingstefan

Hi, thanks for the fast reply! I am using Biopython 1.69, which to my knowledge is the latest version available through Anaconda...

written 22 months ago by wanderingstefan

Did it work recently, e.g. yesterday, and then suddenly stop working? If so ... maybe they just have a temporary problem with a server or something; that can happen as well.

written 22 months ago by LLTommy
22 months ago, wanderingstefan wrote:

I did not get it to work using Entrez, so I wrote another script that uses wget to fetch the sequences via HTTPS (based on the section 'To use HTTPS' of the instructions at https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/#downloadservice). It only works properly when I put a time.sleep() call after every request, but it now seems to work.

Thanks for the help!

Cheers
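
For anyone who prefers to stay in Python, the HTTPS workaround described above can be sketched roughly as follows. This is not the script mentioned in the post, only an illustration under stated assumptions: it assumes you already have each assembly's FTP directory path, that the genomic FASTA inside that directory is named `<directory name>_genomic.fna.gz` as described in the NCBI FAQ, and that swapping `ftp://` for `https://` reaches the same file. The helper names, the example path in the usage note and the 3 second pause are illustrative choices, written for Python 3.

```python
import os
import time
import urllib.request
from urllib.parse import urlsplit


def genomic_fasta_url(ftp_path):
    """Build the HTTPS URL of the genomic FASTA from an assembly's FTP directory path."""
    parts = urlsplit(ftp_path.rstrip("/"))
    # The last path component is the assembly directory name, which the
    # file name is assumed to repeat as its prefix.
    dirname = parts.path.rsplit("/", 1)[-1]
    return "https://{}{}/{}_genomic.fna.gz".format(parts.netloc, parts.path, dirname)


def download_all(ftp_paths, out_dir, pause_seconds=3.0):
    """Download each genomic FASTA over HTTPS, pausing after every request."""
    os.makedirs(out_dir, exist_ok=True)
    for ftp_path in ftp_paths:
        url = genomic_fasta_url(ftp_path)
        target = os.path.join(out_dir, url.rsplit("/", 1)[-1])
        urllib.request.urlretrieve(url, target)
        # Pause between requests, mirroring the time.sleep() workaround above.
        time.sleep(pause_seconds)
```

As a usage example, `genomic_fasta_url("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2")` builds a URL ending in `GCF_000005845.2_ASM584v2_genomic.fna.gz`, and `download_all(list_of_paths, "genomes")` would then fetch the files one at a time.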
