Question: How to specify columns in Uniprot batch query from Uniprot IDs?
15 months ago by
Solowars50
Brazil/Porto Alegre/UFRGS
Solowars50 wrote:

Dear all,

My question is similar to How To Programmatically Retrieve A Batch Of Fasta Sequences From For A List Of Uniprot Accession Ids?, in which someone tries to retrieve FASTA sequences from UniProt for a list of UniProt protein IDs.

In my case, I also have a list of UniProt IDs, and I tried the Perl script from that example (see link above). Being a complete Perl layman, I managed to retrieve the main result table in Excel. However, I would like to add some more columns (e.g. sequences), something the web service allows. Is there an easy way to modify that Perl example to specify the columns I want?

A related question: is it possible to retrieve the same information using R instead? I found a couple of UniProt-related packages in Bioconductor, but from reading the manuals I don't think they can do the trick...

Thank you very much!

batch uniprot proteinid perl • 748 views
ADD COMMENTlink modified 15 months ago by Elisabeth Gasteiger1.6k • written 15 months ago by Solowars50

In Java: use the UniProt XML schema to generate code. See How To Retrieve Human Proteins Sequence Containing A Given Domain ; Finding Single Domain Proteins

ADD REPLYlink written 15 months ago by Pierre Lindenbaum120k

Hi Pierre! I'll give it a try, even though I'm afraid that my knowledge of Java is even more limited than Perl... Thanks!

ADD REPLYlink written 15 months ago by Solowars50
15 months ago by
Geneva
Elisabeth Gasteiger1.6k wrote:

You can try something like the script below. Note that the column names are documented at https://www.uniprot.org/help/uniprotkb_column_names :

use strict;
use warnings;
use LWP::UserAgent;

my $list = $ARGV[0]; # File containing list of UniProt identifiers.

my $base = 'http://www.uniprot.org';
my $tool = 'uploadlists';

my $contact = ''; # Please set your email address here to help us debug in case of problems.
my $agent = LWP::UserAgent->new(agent => "libwww-perl $contact");
push @{$agent->requests_redirectable}, 'POST';

my $response = $agent->post("$base/$tool/",
   [ 'file' => [$list],
     'format' => 'tab',
     'columns'=> 'id,protein_names,genes,length',
     'from' => 'ACC+ID',
     'to' => 'ACC',
   ],
   'Content_Type' => 'form-data');

while (my $wait = $response->header('Retry-After')) {
    print STDERR "Waiting ($wait)...\n";
    sleep $wait;
    $response = $agent->get($response->base);
}
ADD COMMENTlink written 15 months ago by Elisabeth Gasteiger1.6k
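
To add columns such as sequences, only the value of the 'columns' field in the script above needs to change. As a minimal sketch, the helper below (columns_param is a hypothetical name, not part of the UniProt example) builds that comma-separated value from a list. It assumes 'sequence' is a valid column name, which should be checked against the column-name documentation linked above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the UniProt example script): build the
# comma-separated value for the 'columns' form field from a list of names.
sub columns_param {
    my @names = @_;
    my (%seen, @clean);
    for my $name (@names) {
        next unless defined $name;
        $name =~ s/^\s+|\s+$//g;                  # trim surrounding whitespace
        next if $name eq '' or $seen{$name}++;    # skip blanks and duplicates
        push @clean, $name;
    }
    return join ',', @clean;
}

# 'sequence' is an assumed column name; the others come from the script above.
my $columns = columns_param(qw(id protein_names genes length sequence));
print "$columns\n";
```

In the script, the literal would then be replaced with 'columns' => $columns in the list passed to $agent->post.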

It worked! For some reason it wasn't working at first, so I tried to add the example chunk:

    $response->is_success ?
        print $response->content :
        die 'Failed, got ' . $response->status_line .
            ' for ' . $response->request->uri . "\n";

and it worked like a charm!

ADD REPLYlink modified 15 months ago • written 15 months ago by Solowars50
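
Once the response content is printed as in the reply above, the tab-separated text can also be processed in Perl rather than opened in Excel. The following is only a sketch: parse_tab is a hypothetical helper, the header names and rows in $sample are made up for illustration, and it assumes the first line of the output is a header row.

```perl
use strict;
use warnings;

# Hypothetical helper: turn tab-separated text (header line first) into a
# list of hash references keyed by column header.
sub parse_tab {
    my ($content) = @_;
    my @lines = grep { length } split /\r?\n/, $content;   # drop empty lines
    return () unless @lines;
    my @header = split /\t/, shift @lines;
    my @rows;
    for my $line (@lines) {
        my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        my %row;
        @row{@header} = @fields;
        push @rows, \%row;
    }
    return @rows;
}

# Made-up sample standing in for $response->content; not real UniProt data.
my $sample = "Entry\tLength\tSequence\nID_A\t5\tMKTAY\nID_B\t3\tMGS\n";
for my $row (parse_tab($sample)) {
    # Print each record in a FASTA-like layout.
    print ">$row->{Entry} length=$row->{Length}\n$row->{Sequence}\n";
}
```

In the real script, parse_tab($response->content) would take the place of the sample string after a successful request.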