I have FASTA files of several bacterial genomes taken from the NCBI RefSeq database. I want to get the annotation for these genomes, like the one shown in the GenBank file format. By annotation I mean the CDS features (gene start/end positions, description, and so on). Ultimately, I want to extract the CDS nucleotide sequences whose title/description mentions prophages.
This may not be an exact answer to my question, but as a workaround, I downloaded the bacterial genomes I need from the NCBI RefSeq database as GenBank files instead of FASTA. It is then easy to process these GenBank files with the Biopython library to get the CDS features, etc.
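A minimal sketch of that workaround, assuming Biopython is installed and that phage-related CDS can be recognised by the word "phage" in the /product qualifier. The keyword, the helper names `phage_cds` and `write_phage_cds`, and the file names are illustrative assumptions, not part of the original workflow:

```python
from typing import Iterable, Iterator, Tuple

# Assumption: a case-insensitive substring match on "phage" is good enough
# ("prophage integrase", "phage tail protein", ...). Adjust as needed.
KEYWORDS = ("phage",)


def phage_cds(records: Iterable, keywords=KEYWORDS) -> Iterator[Tuple[str, str, str]]:
    """Yield (fasta_id, product, nucleotide_sequence) for matching CDS features."""
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            # Qualifier values are lists; pseudogene CDS may lack /product.
            product = feature.qualifiers.get("product", [""])[0]
            if not any(k in product.lower() for k in keywords):
                continue
            locus_tag = feature.qualifiers.get("locus_tag", ["unknown"])[0]
            # extract() follows the feature location, including strand and joins.
            sequence = str(feature.extract(record.seq))
            yield f"{record.id}|{locus_tag}", product, sequence


def write_phage_cds(genbank_path: str, fasta_path: str) -> int:
    """Write matching CDS from a GenBank file to FASTA; return how many were written."""
    # Imported here so phage_cds() stays usable on already-parsed records.
    from Bio import SeqIO

    count = 0
    with open(fasta_path, "w") as out:
        for name, product, sequence in phage_cds(SeqIO.parse(genbank_path, "genbank")):
            out.write(f">{name} {product}\n{sequence}\n")
            count += 1
    return count
```

For example, `write_phage_cds("genome.gbff", "phage_cds.fasta")` would process one downloaded file (both file names are hypothetical). Matching on the product text is only as good as the annotation itself, so prophage genes annotated as "hypothetical protein" will be missed.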