Question: Fetch the complete GenBank file using Biopython
3.3 years ago, beginner_problem wrote:

I am trying to fetch GenBank files with Biopython for a list of accession IDs stored in a file. This is how I do it so far:

#!/usr/bin/env python

from sys import argv, exit
from Bio import Entrez

# NCBI asks for a contact address with every E-utilities request
Entrez.email = 'example@mail.com'

def searchInDb(searchFor):
    # Download the record for one accession and save it as <accession>.gb
    handle = Entrez.efetch(db='nucleotide', id=searchFor, rettype='gb')

    link = searchFor + ".gb"
    local_file = open(link, 'w')
    local_file.write(handle.read())
    handle.close()
    local_file.close()

if __name__ == '__main__':
    if len(argv) != 2:
        print('\tmissing file link')
        exit(1)
    name = argv[1]

    # The input file holds one accession ID per line
    with open(name, "r") as ins:
        for line in ins:
            ID = line.rstrip('\n')
            print("Getting gb file for ", ID)
            searchInDb(ID)

However, when I do it like this and later look at the .gb file, it is not complete, I don't have any information about the CDS or anything. I need exactly that information, because later I want to parse the gene locus tags and the positions of the CDS on the genome from the .gb file.

Does anyone know how I need to change my code to get the complete .gb file?

3.3 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

"it is not complete, I don't have any information about the CDS or anything"

Give us some example accession numbers. Also, not all sequences carry that information.

Yes, you are right. But when I manually download the .gb files for my accessions, I get the complete file, which is why I guessed my code is wrong. Take this one, for example: NC_021485. With my code, the .gb file is not complete.

written 3.3 years ago by beginner_problem

Use rettype=gbwithparts:

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=NC_021485&retmode=xml&rettype=gbwithparts

However, the result is GenBank text; I don't know how to retrieve the XML output.

written 3.3 years ago by Pierre Lindenbaum
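
For readers who want to apply the rettype=gbwithparts suggestion from Biopython, here is a minimal sketch. The helper names read_accessions and fetch_genbank, the out_dir parameter and the injectable efetch argument are assumptions made for illustration, not part of the original script. It also assumes Biopython is installed and that retmode="text" is the wanted output format.

```python
from pathlib import Path


def read_accessions(lines):
    """Return the non-empty, whitespace-stripped accession IDs from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def fetch_genbank(accession, out_dir=".", email="example@mail.com", efetch=None):
    """Download one record as a GenBank flat file and return the path written.

    `efetch` is a hypothetical injection point (any callable with the same
    keyword arguments as Bio.Entrez.efetch); by default the real one is used.
    """
    if efetch is None:
        # Imported lazily so the helper can be exercised without Biopython.
        from Bio import Entrez

        Entrez.email = email  # NCBI asks for a contact address
        efetch = Entrez.efetch

    # rettype="gbwithparts" is the value suggested in the thread; with
    # rettype="gb" the asker's file lacked the CDS features.
    handle = efetch(db="nuccore", id=accession, rettype="gbwithparts", retmode="text")
    try:
        text = handle.read()
    finally:
        handle.close()

    out_path = Path(out_dir) / f"{accession}.gb"
    out_path.write_text(text)
    return out_path
```

A possible use, assuming a file accessions.txt with one ID per line: open it, pass the file object to read_accessions, and call fetch_genbank on each ID. Large genomes can take a while to download, so a short pause between requests may be kind to the NCBI servers.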

Yes, I tried it, and it works so far. Thanks.

written 3.3 years ago by beginner_problem
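
The question also mentions wanting the locus tags and CDS positions from the downloaded file. A rough sketch of that step follows, assuming the .gb file does contain CDS features. The function names cds_summary and summarize_file are made up for illustration. Coordinates are converted to the 1-based, inclusive convention used in GenBank files; for spliced (join) locations only the outermost start and end are reported.

```python
def cds_summary(record):
    """Return (locus_tag, start, end, strand) tuples for the CDS features of a record.

    `record` is expected to look like a Biopython SeqRecord: it needs a
    `features` list whose items have `type`, `qualifiers` and `location`.
    """
    rows = []
    for feature in record.features:
        if feature.type != "CDS":
            continue
        # Qualifier values are lists; a CDS may have no locus_tag at all.
        locus_tag = feature.qualifiers.get("locus_tag", [None])[0]
        loc = feature.location
        # Biopython starts are 0-based, so add 1 to match the GenBank file.
        rows.append((locus_tag, int(loc.start) + 1, int(loc.end), loc.strand))
    return rows


def summarize_file(path):
    """Parse a GenBank file and collect the CDS rows of every record in it."""
    # Imported lazily so cds_summary can be used without Biopython installed.
    from Bio import SeqIO

    return [row for record in SeqIO.parse(path, "genbank") for row in cds_summary(record)]
```

Calling summarize_file("NC_021485.gb") on a file fetched with rettype=gbwithparts should then yield one tuple per CDS, with a strand of 1 or -1.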