use LWP::Simple;
use URI::URL;
if (@ARGV != 3) {
    print "Usage: perl test.pl < database > < id > < your e-mail >\n";
    exit(0);
}

$database = $ARGV[0];
$id       = $ARGV[1];
$email    = $ARGV[2];
$address  = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

$parameter = {
    "db"      => $database,
    "id"      => $id,
    "retmode" => "text",
    "rettype" => "gp",
    "email"   => $email
};

$url = url($address);
$url->query_form($parameter);
$result = get($url);
print $result;
But this only works for a single ID at a time, and it returns much more information than I need. How can I upload a list of IDs, retrieve only the amino acid sequences, and store the results in a file?
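One possible approach, sketched with the same LWP::Simple style as the script above: EFetch accepts a comma-separated list in the `id` parameter, and `rettype=fasta` returns only the FASTA record instead of the full `gp` entry. Everything else here is an assumption for illustration: the input file with one ID per line, the output name `sequences.fa`, the batch size of 200, the one-second pause, and the helper names `batches` and `efetch_url`. It also assumes `LWP::Protocol::https` is installed, since the sketch uses the https address.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);
use URI;

my $EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

# Split a list of IDs into batches of at most $size entries (hypothetical helper).
sub batches {
    my ($size, @ids) = @_;
    my @out;
    push @out, [ splice(@ids, 0, $size) ] while @ids;
    return @out;
}

# Build an EFetch URL that asks for FASTA records only (hypothetical helper).
sub efetch_url {
    my ($db, $email, @ids) = @_;
    my $uri = URI->new($EFETCH);
    $uri->query_form(
        db      => $db,
        id      => join(',', @ids),   # comma-separated ID list
        rettype => 'fasta',           # sequence only, not the full "gp" record
        retmode => 'text',
        email   => $email,
    );
    return $uri->as_string;
}

if (@ARGV == 3) {
    my ($db, $id_file, $email) = @ARGV;

    # Read one ID per line, skipping blank lines and trimming whitespace.
    open(my $in, '<', $id_file) or die "Can't open $id_file: $!";
    my @ids = grep { /\S/ } <$in>;
    close $in;
    s/^\s+|\s+$//g for @ids;

    open(my $out, '>', 'sequences.fa') or die "Can't open sequences.fa: $!";
    for my $batch (batches(200, @ids)) {
        my $fasta = get(efetch_url($db, $email, @$batch));
        die "Request failed for batch starting at $batch->[0]\n"
            unless defined $fasta;
        print $out $fasta;
        sleep 1;   # assumed pause between requests to be gentle on the server
    }
    close $out;
}
else {
    print STDERR "Usage: perl fetch_fasta.pl <database> <id file> <your e-mail>\n";
}
```

Run as, for example, `perl fetch_fasta.pl protein ids.txt you@example.org`; the script name and argument order are likewise only illustrative.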
I know your question is answered, but I thought I would post this script for others interested in doing iterative sequence retrievals. You can try this BioPerl E-utils script. This example is the same as searching for 'crab' in the protein database, but it will save all sequences it finds.
#!/usr/bin/perl -w
########## <http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook> ##########
BEGIN { push @INC, "path/to/BioPerl"; }
use Bio::DB::EUtilities;

# set optional history queue
my $factory = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -email      => 'mymail@foo.bar',
                                       -db         => 'protein',
                                       -term       => 'crab',
                                       -usehistory => 'y');
my $count = $factory->get_count;

# get history from queue
my $hist = $factory->next_History || die 'No history data returned';
print "History returned\n";

# note db carries over from above
$factory->set_parameters(-eutil   => 'efetch',
                         -rettype => 'fasta',
                         -history => $hist);
my $retry = 0;
my ($retmax, $retstart) = (500, 0);
open (my $out, '>', 'lots_of_crab_sequences.fa') || die "Can't open file: $!";

RETRIEVE_SEQS:
while ($retstart < $count) {
    $factory->set_parameters(-retmax   => $retmax,
                             -retstart => $retstart);
    eval {
        $factory->get_Response(-cb => sub { my ($data) = @_; print $out $data });
    };
    if ($@) {
        die "Server error: $@. Try again later" if $retry == 5;
        print STDERR "Server error, redo #$retry\n";
        # count the attempt, then retry the same batch
        $retry++;
        redo RETRIEVE_SEQS;
    }
    #say "Retrieved $retstart";
    $retstart += $retmax;
}
close $out;
Thanks a lot Pierre, this has worked for me!