I am trying to fetch GenBank files with Biopython for a list of accession IDs, which are stored in a file, one per line. This is how I do it so far:
    #!/usr/bin/env python
    from sys import argv, stdout, exit
    from Bio import SeqIO
    from Bio import Entrez

    Entrez.email = 'email@example.com'


    def searchInDb(searchFor):
        # Fetch one record from the nucleotide database in GenBank format
        handle = Entrez.efetch(db='nucleotide', id=searchFor, rettype='gb')
        link = searchFor + ".gb"
        local_file = open(link, 'w')
        local_file.write(handle.read())
        handle.close()
        local_file.close()


    if __name__ == '__main__':
        if len(argv) != 2:
            print('\tmissing file link')
            exit(1)
        name = argv[1]
        # The input file holds one accession ID per line
        with open(name, "r") as ins:
            for line in ins:
                ID = line.rstrip('\n')
                print("Getting gb file for " + ID)
                searchInDb(ID)
However, when I do it like this and later look at the .gb file, it is not complete: it contains no information about the CDS features or anything similar. I need exactly those, because later I want to parse the gene locus tags from the .gb file, as well as the positions of the CDS features on the genome, and so on.
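For context, this is roughly the parsing step I have in mind for later. It is only a minimal sketch: `cds_summary` and `summarize_file` are names I made up for illustration, and it assumes the downloaded file contains a feature table with CDS entries, which is exactly what is missing at the moment.

```python
def cds_summary(record):
    """Return (locus_tag, start, end, strand) for every CDS feature of a record."""
    rows = []
    for feature in record.features:
        if feature.type != "CDS":
            continue
        # Qualifier values are lists in Biopython; locus_tag may be absent.
        locus_tag = feature.qualifiers.get("locus_tag", [None])[0]
        loc = feature.location
        # Biopython positions are 0-based, with an exclusive end.
        rows.append((locus_tag, int(loc.start), int(loc.end), loc.strand))
    return rows


def summarize_file(path):
    """Parse a GenBank file and print its CDS features (hypothetical helper)."""
    # Imported here so that cds_summary itself has no Biopython dependency.
    from Bio import SeqIO

    for record in SeqIO.parse(path, "genbank"):
        rows = cds_summary(record)
        print(record.id, "has", len(rows), "CDS features")
        for row in rows:
            print("\t".join(str(value) for value in row))
```

Calling `summarize_file` on one of the files written by the script above currently reports zero CDS features for my accessions, which is how I noticed that the records are incomplete.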
Does anyone know how I need to change my code so that I get the complete .gb file?