I want to download all EST sequences from GenBank that belong to the order Hymenoptera.
This is easy by pointing and clicking: http://www.ncbi.nlm.nih.gov/sites/entrez?term=Hymenoptera&cmd=Search&db=nucest
However, I cannot get the efetch/eutils syntax right to do this from the command line.
Note that, unlike in the following question, I do not know the identifiers for these ESTs: How To Retrive The Dna Sequence From A List Of Embl And Geneid
Thanks!
yannick
Or you can pass more than one ID, separated by commas, in the same 'id' parameter.
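As a rough illustration of the multi-ID request, here is a minimal Python sketch that only builds the efetch URL. The helper name `build_efetch_url` and the two ID values are made up for the example, and `nucest` is simply the database name from the URL in the question, so check that it is still served before relying on it. The `rettype=fasta` and `retmode=text` settings ask for plain FASTA text rather than XML.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, db="nucest", rettype="fasta", retmode="text"):
    """Return an efetch URL that requests several records in one call.

    build_efetch_url is a hypothetical helper, not part of any NCBI library.
    """
    params = {
        "db": db,
        # All IDs go into the single 'id' parameter, separated by commas.
        "id": ",".join(str(i) for i in ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return EFETCH_URL + "?" + urlencode(params, safe=",")


# Placeholder IDs for illustration only, not real Hymenoptera ESTs.
print(build_efetch_url([12345, 67890]))
```

The resulting URL can be opened with any HTTP client (curl, wget, `urllib.request.urlopen`).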
Merci, Pierre! Your response implies that one cannot avoid calling efetch 300,000 times? I'm a bit scared NCBI will think I'm running a denial-of-service attack!!
Also, is it possible to get eutils output as plain text rather than XML?
No, you can also use the 'usehistory' parameter in esearch (http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#History). This will give you a 'WebEnv' that you can later use in a single efetch query.
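The history approach could look roughly like the Python sketch below. It assumes the esearch response is XML with `Count`, `QueryKey` and `WebEnv` elements directly under the root, and that `nucest` (taken from the question's URL) is still a valid database name. All function names are hypothetical. Splitting the download into batches with `retstart`/`retmax` is my own assumption for a result set of this size; a small result set should fit in one efetch call.

```python
import time
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def esearch_url(term, db="nucest"):
    """URL for an esearch call that keeps its result set on the history server."""
    return EUTILS + "esearch.fcgi?" + urlencode(
        {"db": db, "term": term, "usehistory": "y"})


def parse_history(esearch_xml):
    """Extract (count, query_key, webenv) from an esearch XML response."""
    root = ET.fromstring(esearch_xml)
    return (int(root.findtext("Count")),
            root.findtext("QueryKey"),
            root.findtext("WebEnv"))


def efetch_url(query_key, webenv, retstart, retmax, db="nucest"):
    """URL for one efetch batch read from the stored result set, as FASTA text."""
    return EUTILS + "efetch.fcgi?" + urlencode({
        "db": db, "query_key": query_key, "WebEnv": webenv,
        "retstart": retstart, "retmax": retmax,
        "rettype": "fasta", "retmode": "text"})


def download_ests(term, out_path, batch_size=500, pause=0.5):
    """Search once, then append the stored results to out_path batch by batch."""
    with urlopen(esearch_url(term)) as response:
        count, query_key, webenv = parse_history(response.read())
    with open(out_path, "wb") as out:
        for start in range(0, count, batch_size):
            with urlopen(efetch_url(query_key, webenv, start, batch_size)) as response:
                out.write(response.read())
            time.sleep(pause)  # pause between requests to go easy on NCBI's servers
```

Calling `download_ests("Hymenoptera", "hymenoptera_est.fasta")` would then run one search and a series of batched fetches, without ever handling the individual identifiers. The output file name and the batch size of 500 are arbitrary choices.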
excellent, thanks!
BioRuby also provides a great wrapper: http://bioruby.org/rdoc/classes/Bio/NCBI/REST.html