I am trying to download a large number of NCBI entries for a set of accession numbers returned by a blastp search. For each accession in my list, I want its taxonomy (bacteria or virus). That information is available if I run the list through Batch Entrez against the protein database and download the results in GenPept format.

My problem is that Batch Entrez rejects some of the accessions with a message like "Id=MBG9901843.1: protein: Wrong UID MBG9901843.1". This affects about 230 of the 2500 accessions. The rejected accessions are found when I use the Identical Protein Groups option in Batch Entrez instead, but then I would have to switch each record to GenPept format and download it individually.

Is there a simpler way to obtain the taxonomy for each accession in a list? I could simply ignore the accessions flagged as "Wrong UID", but that would discard more of the data than I would like. I know this is a somewhat long-winded explanation, and I'll be glad to clarify any aspect.
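To make the goal concrete, here is a minimal sketch of what I want to end up with for every accession, assuming the GenPept records have already been downloaded as one text file. The function name top_level_taxa is hypothetical, and the parsing assumes the usual flat-file layout (a VERSION line, an ORGANISM line followed by indented lineage lines, and "//" closing each record) and an organism name that fits on one line. It does nothing about the "Wrong UID" accessions, which never reach the file.

```python
def top_level_taxa(genpept_text):
    """Map each record's VERSION accession to the first term of its lineage.

    Assumption: standard GenPept flat-file layout, with the organism name
    on a single line and the lineage on the indented lines that follow it.
    """
    result = {}
    accession = None
    lineage_parts = []
    in_organism = False

    for line in genpept_text.splitlines():
        if line.startswith("VERSION"):
            # e.g. "VERSION     MBG9901843.1"
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else None
            in_organism = False
        elif line.startswith("  ORGANISM"):
            # The lineage follows on the indented continuation lines.
            in_organism = True
            lineage_parts = []
        elif line.startswith("//"):
            # End of record: keep only the first lineage term.
            if accession is not None:
                lineage = " ".join(lineage_parts)
                result[accession] = lineage.split(";")[0].strip(" .") or "unknown"
            accession, lineage_parts, in_organism = None, [], False
        elif in_organism:
            if line.startswith(" "):
                lineage_parts.append(line.strip())
            else:
                # A new top-level keyword (e.g. REFERENCE) ends the block.
                in_organism = False

    return result
```

Calling top_level_taxa on the contents of the downloaded file should give a dictionary such as accession -> "Bacteria" or "Viruses", which is the table I am after.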
Thank you very much.