Getting Taxonomy Lineage From NCBI GI Or Accession Number
11.7 years ago
Abhi ★ 1.6k

Hey Guys

I have a specific question about getting a taxonomic lineage given an NCBI accession or GI number. In the past I have used the CPAN module below, but I believe it only works for GIs and not for NCBI accessions. Fetching via NCBI E-utilities (efetch) is not an option, as we need to make millions of queries per week for BLAST results.

Bio::LITE::Taxonomy::NCBI

Just wondering if there are other ways to do this that do not depend on fetching anything from the internet.

blast taxonomy • 29k views

@Pierre: with due respect, the two posts are not exactly the same. I was looking for NCBI GI to taxonomic lineage, i.e. Superkingdom:Kingdom:Phylum:Class etc., and not just the NCBI tax ID.

11.7 years ago
Neilfws 49k

For "millions of queries", you will want a local copy of the taxonomy database.

Start with the FTP site and consult the readme files to see which is appropriate for you. One option is to load the taxonomy into a BioSQL instance as described here.
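If a full BioSQL instance is more than you need, the taxonomy dump can also be held in plain dictionaries. Below is a minimal sketch in Python, assuming the `nodes.dmp` and `names.dmp` files from the taxdump archive, whose fields are separated by tab-pipe-tab and whose rows end with tab-pipe. The function names (`parse_dmp_line`, `load_taxdump`, `load_taxdump_files`) are made up for illustration and do not come from any existing library.

```python
def parse_dmp_line(line):
    # taxdump rows look like: field1 <tab>|<tab> field2 <tab>|<tab> ... <tab>|
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_taxdump(nodes_lines, names_lines):
    """Build tax_id -> parent, tax_id -> rank and tax_id -> scientific name maps."""
    parent, rank, name = {}, {}, {}
    for line in nodes_lines:
        fields = parse_dmp_line(line)
        tax_id = int(fields[0])
        parent[tax_id] = int(fields[1])  # column 2: parent tax_id
        rank[tax_id] = fields[2]         # column 3: rank
    for line in names_lines:
        fields = parse_dmp_line(line)
        # names.dmp lists several names per taxon; keep only the scientific one
        if fields[3] == "scientific name":
            name[int(fields[0])] = fields[1]
    return parent, rank, name


def load_taxdump_files(nodes_path, names_path):
    # Paths are whatever you unpacked from the taxdump archive (assumption).
    with open(nodes_path) as nodes, open(names_path) as names:
        return load_taxdump(nodes, names)
```

Everything is held in memory, so after the one-off load each lookup is a dictionary access with no network involved.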

I guess that's what I will eventually do. I was just wondering if someone has already built something similar that I could use.

7.2 years ago
-_- ★ 1.1k

First, you need to map accession numbers (GI is deprecated) to tax IDs using the *accession2taxid.gz files under ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/ (the older GI-based mapping is ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_nucl.dmp.gz). Then, given a tax ID, you can trace its whole lineage.
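As a rough sketch of the mapping step in Python: the accession2taxid files are tab-separated with a header row and the columns accession, accession.version, taxid and gi. The function names and the `wanted` argument below are illustrative assumptions, not part of any NCBI tool.

```python
import gzip


def map_accessions(lines, wanted):
    """Return {accession: taxid} for every accession in `wanted` found in `lines`.

    `wanted` may hold versioned (X00001.1) or unversioned (X00001) accessions.
    """
    wanted = set(wanted)
    found = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "accession":
            continue  # header row
        accession, versioned, taxid = fields[0], fields[1], fields[2]
        for key in (accession, versioned):
            if key in wanted:
                found[key] = int(taxid)
    return found


def map_accessions_from_file(path, wanted):
    # Stream the gzipped file once; only the requested accessions are kept in memory.
    with gzip.open(path, "rt") as handle:
        return map_accessions(handle, wanted)
```

Streaming the file once per batch of BLAST hits keeps memory use low. For repeated lookups, loading the same rows into a local key-value store or SQLite table would avoid rescanning the file.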

The whole NCBI taxonomy database is not that big. I have written some code to convert NCBI taxdump into lineages identified by tax ids, https://github.com/zyxue/ncbitax2lin. You may find it useful.
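To show what tracing a lineage means in practice, here is a minimal Python sketch, separate from ncbitax2lin. It assumes you already have three dictionaries keyed by tax ID (parent, rank, scientific name), for example built from `nodes.dmp` and `names.dmp`, and it relies on the root node being its own parent in `nodes.dmp`. The function names and the default rank list are assumptions; check the rank labels against your own copy of the dump.

```python
def lineage(tax_id, parent, rank, name):
    """Return [(rank, name), ...] from the top of the tree down to `tax_id`, root excluded."""
    path = []
    seen = set()
    current = tax_id
    while current not in seen:  # guard against malformed, cyclic input
        seen.add(current)
        up = parent[current]
        if up == current:       # the root node is its own parent
            break
        path.append((rank[current], name.get(current, "")))
        current = up
    return path[::-1]


def format_lineage(path, ranks=("superkingdom", "phylum", "class", "order",
                                "family", "genus", "species")):
    """Join the names at the chosen ranks with ':'; missing ranks become empty fields."""
    by_rank = {r: n for r, n in path}
    return ":".join(by_rank.get(r, "") for r in ranks)
```

Each lookup walks at most the depth of the tree, so millions of queries per week are feasible once the dictionaries are loaded.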
