Question: How to retrieve all FASTA sequences for an Assembly/BioProject ID using the Entrez Programming Utilities
fengzys wrote, 3.4 years ago:

I am trying to download all the viral and bacterial genomes in the Genome database using the Entrez utilities. First, I used ESearch to retrieve all the viral Genome UIDs, which I then translated to nuccore GI numbers with ELink. Some GI numbers correspond to the parent (master) record of a WGS project, so the FASTA sequence cannot be obtained directly with EFetch. By parsing the GenBank (gb) output for these GIs I can get the accession numbers, but this is very tedious. Is there a way to get all the sequences that belong to an Assembly or BioProject? (ELink can translate a Genome UID to an Assembly or BioProject ID.) Thanks for your time.
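For readers following along, the ESearch, ELink, EFetch chain described above can be sketched as plain URL construction. This is only a minimal sketch, not the poster's actual script: the helper names are hypothetical, the search term `Viruses[Organism]` is an assumed example, and no request is sent, so it does not address the WGS master-record problem.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def _join_ids(ids):
    """Join UIDs into the comma-separated form used by the `id` parameter."""
    return ",".join(str(i) for i in ids)


def esearch_url(db, term, retmax=100):
    """Build an ESearch URL that looks up UIDs in `db` matching `term`."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def elink_url(dbfrom, db, ids):
    """Build an ELink URL that maps UIDs in `dbfrom` to linked UIDs in `db`."""
    params = {"dbfrom": dbfrom, "db": db, "id": _join_ids(ids)}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"


def efetch_fasta_url(ids, db="nuccore"):
    """Build an EFetch URL that requests FASTA text for the given UIDs."""
    params = {"db": db, "id": _join_ids(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed example term; adjust to the actual query of interest.
    print(esearch_url("genome", "Viruses[Organism]"))
    # Genome UIDs quoted later in this thread.
    print(elink_url("genome", "nuccore", [44851, 44849, 44848]))
    print(efetch_fasta_url([1027168540]))
```

Each URL can be opened with any HTTP client. For large ID lists, NCBI's request-rate limits apply and the IDs would need to be sent in batches.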

wpwupingwp (China) wrote, 3.4 years ago:

Try Batch Entrez: http://www.ncbi.nlm.nih.gov/sites/batchentrez

fengzys wrote, 3.4 years ago:

Thanks. However, I found that Batch Entrez cannot deal with these master records either, e.g. genome IDs 44851, 44849, 44848; assembly IDs 739291, 738621, 194461; or GIs 1027168540, 1027763052, 738803256.
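The manual workaround mentioned in the question, parsing the GenBank output of a master record to recover its contig accessions, could look roughly like the sketch below. It assumes the master record's flat file has a `WGS` line giving a first-last accession range, which is the usual layout but is not confirmed in this thread. The function names and the `ABCD01...` accessions are hypothetical placeholders, not real records.

```python
import re

# Accession prefix (letters plus a two-digit version) followed by the serial number.
_ACCESSION = re.compile(r"([A-Z]+\d{2})(\d+)")


def find_wgs_range(genbank_text):
    """Return the value of the WGS line in a GenBank flat file, or None if absent."""
    match = re.search(r"^WGS\s+(\S+)", genbank_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def expand_wgs_range(wgs_range):
    """Expand 'ABCD01000001-ABCD01000003' into the individual contig accessions."""
    first, _, last = wgs_range.partition("-")
    last = last or first  # a single accession is a range of one
    m_first = _ACCESSION.fullmatch(first)
    m_last = _ACCESSION.fullmatch(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError(f"unrecognised WGS range: {wgs_range!r}")
    prefix, digits = m_first.groups()
    width = len(digits)  # keep the zero padding of the original accessions
    start, end = int(digits), int(m_last.group(2))
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


if __name__ == "__main__":
    # Hypothetical excerpt of a master record, for illustration only.
    sample = "LOCUS       ABCD01000000\nWGS         ABCD01000001-ABCD01000003\n//\n"
    print(expand_wgs_range(find_wgs_range(sample)))
```

The resulting accessions could then be passed to EFetch in batches. Real projects can contain many thousands of contigs, so the list may be large.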
