Question: How to retrieve any and all NCBI/GenBank accession numbers from a Taxonomy ID?
yarmda wrote (2.8 years ago):

I want to supply a taxID at any taxonomic level and retrieve the accession numbers of all organisms that fall under it. For example, taxID 1063 is the species Rhodobacter sphaeroides, which has around 7 strains. Is it possible to use efetch to retrieve the accession numbers for all of their genomes?

Retrieving the taxID from an accession number is straightforward with: curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=acc_number&rettype=fasta&retmode=xml"

Granted, there's some grepping after the data comes back, but that's fine. I'm looking for something similar that will give back every accession number associated with the clade's tax ID.
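The grepping step could also be done with an XML parser. This is only a sketch: it assumes the efetch response is TinySeq XML with a TSeq_taxid element per record, and the function name and the sample record (accession and all) are made up for illustration.

```python
import xml.etree.ElementTree as ET


def taxids_from_tinyseq(xml_text):
    """Return every taxID found in a TinySeq XML document.

    Assumes the layout returned by efetch with rettype=fasta and
    retmode=xml, where each record carries a <TSeq_taxid> element.
    """
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("TSeq_taxid")]


# Hypothetical, trimmed-down response; the accession is a placeholder.
sample_tinyseq = """<TSeqSet>
  <TSeq>
    <TSeq_accver>XX_000000.1</TSeq_accver>
    <TSeq_taxid>1063</TSeq_taxid>
  </TSeq>
</TSeqSet>"""

print(taxids_from_tinyseq(sample_tinyseq))
```

In a real script the string passed in would be the body returned by the curl call above.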

Ideally, I would be able to include a taxID query in the eutils/efetch call I have above. Is it possible to query by one of the fields returned by that call?

Since the above curl brings back data that includes taxID, could I query the nuccore database by the taxID instead of the accession number?
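One way this might work is a two-step call: esearch on nuccore with a taxonomy term, then efetch on the returned UIDs with rettype=acc. The sketch below only builds the URLs and parses an esearch response, so it makes no network calls. The function names and the sample UIDs are invented for illustration, and the term syntax txid1063[Organism:exp] is assumed to behave as it does in the Entrez web search (the taxon plus everything beneath it).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(taxid, db="nuccore", retmax=500):
    """Build an esearch URL for all records under a taxID."""
    params = {"db": db, "term": f"txid{taxid}[Organism:exp]", "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?" + urlencode(params)


def uids_from_esearch(xml_text):
    """Pull the UIDs out of an eSearchResult XML document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("./IdList/Id")]


def efetch_accession_url(uids, db="nuccore"):
    """Build an efetch URL that asks for accessions as plain text."""
    params = {"db": db, "id": ",".join(uids), "rettype": "acc", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?" + urlencode(params)


# Hypothetical esearch response with made-up UIDs.
sample_esearch = """<eSearchResult>
  <Count>2</Count>
  <IdList><Id>111</Id><Id>222</Id></IdList>
</eSearchResult>"""

uids = uids_from_esearch(sample_esearch)
print(esearch_url(1063))
print(efetch_accession_url(uids))
```

Note that a search like this matches every nucleotide record under the taxon, not only complete genomes, so some extra filtering in the term is probably needed. Large result sets would also need paging with retstart and retmax.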

Does that make sense?

Modified 2.8 years ago • written 2.8 years ago by yarmda

I have not found an automated solution to this yet. For now I have resolved to download the accession numbers from the NCBI site manually. Since I'm only after a handful of unchanging targets, this will suit my needs.

Reply written 2.8 years ago by yarmda
genomax (United States) wrote (2.8 years ago):

See my answer in this post: Automatically Accessing all the sequences of a given order?
Since you want accession numbers, add a step 4a: under "Summary" on the left side of the page, choose "Format" --> "Accession list".

Modified 2.8 years ago • written 2.8 years ago by genomax

Thanks for this! While this is a solution, I'm trying to keep everything automated in a single script - so I don't think this is quite the solution I want.

Reply written 2.8 years ago by yarmda
Prasad (India) wrote (2.8 years ago):

Have you tried elink?

Here is the example output for the taxid you mentioned.

Written 2.8 years ago by Prasad

What do the IDs in the output represent?

Reply written 2.8 years ago by yarmda

They are the GI IDs of all the entries for that particular taxid in the NCBI nucleotide database. You can change the database name accordingly; see here.
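For use in a script, the elink call and the ID extraction might look roughly like this. It is a sketch rather than the exact request used for the output above: the function names are invented, the sample response uses made-up IDs, and it assumes the usual eLinkResult layout (LinkSet, LinkSetDb, Link, Id). No network call is made here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(taxid, db="nuccore"):
    """Build an elink URL from the taxonomy database to a target database."""
    params = {"dbfrom": "taxonomy", "db": db, "id": taxid}
    return f"{EUTILS}/elink.fcgi?" + urlencode(params)


def linked_ids(xml_text):
    """Return the linked IDs listed in an eLinkResult XML document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("./LinkSet/LinkSetDb/Link/Id")]


# Hypothetical elink response with made-up linked IDs.
sample_elink = """<eLinkResult>
  <LinkSet>
    <DbFrom>taxonomy</DbFrom>
    <IdList><Id>1063</Id></IdList>
    <LinkSetDb>
      <DbTo>nuccore</DbTo>
      <Link><Id>111</Id></Link>
      <Link><Id>222</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>"""

print(elink_url(1063))
print(linked_ids(sample_elink))
```

Changing the db argument is the scripted version of changing the database name. The returned IDs could then be passed to efetch with rettype=acc to get accession numbers.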

Reply written 2.8 years ago by Prasad