Hello, I am a beginner in bioinformatics. For my internship I have to download all the Klebsiella genome sequences from NCBI using Biopython. I specifically need every assembly counted under "Genome Assembly and Annotation report (10703)" on this page: https://www.ncbi.nlm.nih.gov/genome/browse/#!/prokaryotes/815/ . To get them, I took the identifiers from https://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/prokaryotes.txt and tried to write a Biopython script that fetches the corresponding sequences. The script doesn't work, and my guess is that the identifiers on the FTP site and those in the nucleotide database are not the same. Is there some kind of correspondence between the nuccore (nucleotide) identifiers and the ones listed on the FTP site?
I have also looked at https://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/IDS/Bacteria.ids, but it contains only 31 of the identifiers, not all of them. Thanks a lot. Here is the Biopython code:
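from Bio import Entrez, SeqIO

# NCBI asks for a contact address with every E-utilities request
Entrez.email = "your.name@example.org"  # placeholder, replace with your own

# Read one identifier per line, ignoring blank lines
with open("listId.txt", "r") as file:
    list_id = [line.strip() for line in file if line.strip()]
print(list_id)

# Pass the list itself, not the string "list_id"
fic_seq = Entrez.efetch(db="nucleotide", id=list_id, rettype="gb", retmode="text")

# The handle can only be read once, so keep the parsed records in a list
my_seq = list(SeqIO.parse(fic_seq, "gb"))
fic_seq.close()

for seq in my_seq:
    print(seq)

# Write the same records out in FASTA format
SeqIO.write(my_seq, "out.fasta", "fasta")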
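For context, here is a rough sketch of how the identifier list could be pulled out of a tab-separated report such as prokaryotes.txt before it is written to listId.txt. It is only an illustration under several assumptions: the column names `#Organism/Name` and `Replicons`, the `label:ACC1/ACC2; label:ACC3` cell layout, and the `-` placeholder for empty cells are my guesses about the file and should be checked against the real header; `accessions_for_organism` is a hypothetical helper name, and the sample rows use made-up placeholder accessions, not real records.

```python
import csv
import io


def accessions_for_organism(report_text, organism_prefix,
                            name_column="#Organism/Name",
                            accession_column="Replicons"):
    """Return the first accession of each replicon for matching organisms.

    The column names and the cell layout are assumptions about the report
    file; adjust them to whatever the real header line contains.
    """
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    accessions = []
    for row in reader:
        name = row.get(name_column) or ""
        if not name.lower().startswith(organism_prefix.lower()):
            continue
        field = row.get(accession_column) or ""
        # Assumed layout: "label:ACC1/ACC2; label:ACC3", with "-" when empty
        for replicon in field.split(";"):
            replicon = replicon.strip()
            if not replicon or replicon == "-":
                continue
            # Drop the optional "label:" prefix, then keep the first accession
            ids = replicon.split(":", 1)[-1]
            first = ids.split("/")[0].strip()
            if first:
                accessions.append(first)
    return accessions


# Made-up sample rows with placeholder accessions, for illustration only
SAMPLE = (
    "#Organism/Name\tTaxID\tReplicons\n"
    "Klebsiella pneumoniae\t573\tchromosome:ACC_0001.1/ACC_0002.1; plasmid pX:ACC_0003.1\n"
    "Escherichia coli\t562\tchromosome:ACC_0004.1\n"
    "Klebsiella oxytoca\t571\t-\n"
)

klebsiella_ids = accessions_for_organism(SAMPLE, "Klebsiella")
print(klebsiella_ids)
```

With the real file, the text would be read from the downloaded prokaryotes.txt instead of `SAMPLE`, and the resulting list written one identifier per line to listId.txt. Whether the first accession in each cell is the one the nucleotide database expects is exactly the part I am unsure about.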