How to get Taxonomy and Taxon ID for an Accession number using Python?
21 months ago
pramirez ▴ 10

Hi! I am writing a Python script to obtain the taxonomic classification, down to the phylum level, of some protein sequences. I currently have only the NCBI accession numbers (see question Retrieving taxonomy from entrez search in biopython). I realized that I need the taxon ID for each accession so that I can look it up in the taxonomy database and get the phylum. Do you know how to do this in Python?

Many thanks!

python biopython entrez • 735 views
21 months ago
GenoMax 141k

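Using Entrez Direct (you will need to translate this to Biopython). The following commands will work for both of your questions: the first lists every ranked taxon in the lineage (rank, scientific name, taxon ID), and the second prints only the phylum name.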

$ esearch -db protein -query OGI11933 | elink -target taxonomy | efetch -format native -mode xml | xtract -pattern Taxon -block "*/Taxon" -unless Rank -equals "no rank" -tab "\n" -element Rank,ScientificName,TaxId
superkingdom    Archaea 2157
clade   DPANN group     1783276
phylum  Candidatus Micrarchaeota        1801631

$ esearch -db protein -query OGI11933 | elink -target taxonomy | efetch -format native -mode xml | xtract -pattern Taxon -block "*/Taxon" -if Rank -equals "phylum" -element ScientificName
Candidatus Micrarchaeota
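
For anyone wanting the Biopython translation mentioned above, here is a minimal sketch of the same esearch, elink, efetch chain. It is not a tested drop-in: the function names (`ranked_lineage`, `phylum_of`, `fetch_lineage`) and the e-mail address are placeholders made up for illustration, and it assumes the accession resolves to a single protein record with a taxonomy link. `LineageEx` is the field in which `Entrez.read` exposes the nested Taxon entries of the taxonomy XML.

```python
def ranked_lineage(lineage_ex):
    """Keep only ranked taxa, like `-unless Rank -equals "no rank"`."""
    return [
        (taxon["Rank"], taxon["ScientificName"], taxon["TaxId"])
        for taxon in lineage_ex
        if taxon["Rank"] != "no rank"
    ]


def phylum_of(lineage_ex):
    """Return (scientific name, taxon ID) of the phylum, or None if absent."""
    for taxon in lineage_ex:
        if taxon["Rank"] == "phylum":
            return taxon["ScientificName"], taxon["TaxId"]
    return None


def fetch_lineage(accession, email):
    """Hypothetical helper: accession -> (taxon ID, lineage entries)."""
    # Imported here so the two helpers above work without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # esearch -db protein -query <accession>
    handle = Entrez.esearch(db="protein", term=accession)
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        raise ValueError(f"no protein record found for {accession}")

    # elink -target taxonomy
    handle = Entrez.elink(dbfrom="protein", db="taxonomy", id=ids[0])
    links = Entrez.read(handle)
    handle.close()
    if not links or not links[0]["LinkSetDb"]:
        raise ValueError(f"no taxonomy link found for {accession}")
    tax_id = links[0]["LinkSetDb"][0]["Link"][0]["Id"]

    # efetch -format native -mode xml
    handle = Entrez.efetch(db="taxonomy", id=tax_id, retmode="xml")
    taxon = Entrez.read(handle)[0]
    handle.close()
    return tax_id, taxon["LineageEx"]


if __name__ == "__main__":
    # Placeholder e-mail; replace with your own before running.
    tax_id, lineage = fetch_lineage("OGI11933", "you@example.org")
    for rank, name, tid in ranked_lineage(lineage):
        print(rank, name, tid, sep="\t")
    print(phylum_of(lineage))
```

The network calls are kept in one function so that the filtering logic can be checked on its own. For many accessions, consider adding a pause between requests or an NCBI API key to stay within the E-utilities rate limits.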
Thank you!
