3.3 years ago
izhang
I have a list of protein accession IDs from the NCBI nr database that look like this:
WP_0445013 WP_1884344 TBR13838
These are all bacterial proteins from a range of different bacteria, and I've made a phylogenetic tree based on these proteins. However, the tree annotations are these labels and I want to annotate it with the taxonomy instead. I'm not very familiar with the Entrez system but is there an easy way to replace these accession IDs with the taxonomy of the sequence, such as genus and species names?
Any help is appreciated, thanks!
You could do this with EntrezDirect, though the example accession numbers you posted don't seem to be correct. Keep in mind that WP accessions refer to multiple organisms.
Thank you, that works! It seems like my Phylip conversion program truncated some of the accession numbers. I retrieved these proteins from NCBI nr, but is there a place I can download the entire set of complete, annotated bacterial genomes? I'm trying to look at the evolution of a widespread metabolic pathway across all/as many bacteria as possible.
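For the relabeling step asked about in the question, one possible approach is sketched below in Python. It assumes you already have a tab-separated accession-to-organism table (for example, saved from an EntrezDirect lookup) and a Newick tree with unquoted leaf labels and no whitespace after commas. The function names `parse_mapping` and `relabel_newick`, the `ACC_000x` accessions, and the organism assignments are hypothetical placeholders, not taken from the thread.

```python
import re

# Characters that are unsafe in an unquoted Newick label (spaces, brackets, etc.).
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]+")

# Assumption: a leaf label directly follows "(" or "," and runs until
# ":", ",", ")", ";" or whitespace. Internal node labels (after ")") and
# branch lengths (after ":") are therefore not matched.
_LEAF = re.compile(r"(?<=[(,])([^():,;\s]+)")


def parse_mapping(tsv_text):
    """Read 'accession<TAB>organism' lines into a dict (hypothetical format)."""
    mapping = {}
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        accession, organism = line.split("\t", 1)
        mapping[accession.strip()] = organism.strip()
    return mapping


def relabel_newick(newick, mapping):
    """Replace leaf labels with 'Organism_name_ACCESSION'; unknown labels stay as-is."""

    def swap(match):
        accession = match.group(1)
        organism = mapping.get(accession)
        if organism is None:
            return accession  # no taxonomy found: keep the original label
        safe = _UNSAFE.sub("_", organism).strip("_")
        # Keep the accession so labels stay unique when organisms repeat.
        return f"{safe}_{accession}"

    return _LEAF.sub(swap, newick)


if __name__ == "__main__":
    # Placeholder table and tree, for illustration only.
    table = "ACC_0001\tEscherichia coli\nACC_0002\tBacillus subtilis\n"
    tree = "((ACC_0001:0.1,ACC_0002:0.2):0.05,ACC_0003:0.3);"
    print(relabel_newick(tree, parse_mapping(table)))
```

Keeping the accession as a suffix is a deliberate choice here: because a single WP accession can correspond to several organisms, and several proteins can come from the same organism, the organism name alone may not give unique tip labels.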