Question: Querying EBI's PICR via SOAP and Python with multiple accessions?
8
3.7 years ago by
United States
Richard Llewellyn wrote:

Hello,

This is my first stop here. It looks surprisingly active!

I need to cross-reference protein-coding genes from NCBI genomes to UniProt. After trying many other angles, I now want to use PICR programmatically. This is my first reluctant exposure to SOAP and the Python package suds. I found it straightforward to query a single accession per request, but I need to submit many at once for efficiency. Multiple accessions can be queried per request via SOAP, as the PICR REST instructions imply:

"The methods available in the REST service are very similar to those available via SOAP, save for one major difference: only one accession or sequence can be mapped per request."

More likely than not I am missing something simple.

Here is a working example of a suds message as a string (e.g. str(msg)):

my_test_str = '<SOAP-ENV:Envelope xmlns:ns0="http://www.ebi.ac.uk/picr/AccessionMappingService" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">\n <SOAP-ENV:Header/>\n <ns1:Body>\n   <ns0:getUPIForAccession>\n <ns0:accession>BAI16447</ns0:accession>\n <ns0:ac_version/>\n   <ns0:searchDatabases>TrEMBL</ns0:searchDatabases>\n   <ns0:searchDatabases>REFSEQ</ns0:searchDatabases>\n <ns0:taxonId/>\n <ns0:onlyActive/>\n </ns0:getUPIForAccession>\n </ns1:Body>\n</SOAP-ENV:Envelope>'

The wsdl is here: http://www.ebi.ac.uk/Tools/picr/service?wsdl

I've tried manipulating this string by hacking the suds SoapClient.send method directly, e.g. by including a second getUPIForAccession element after the first in the body. The query runs without any visible error but only returns the result for the first accession.

I've resorted to testing by hack because I haven't figured out how to tell suds I want to include more than one query per request. I think suds may not support this, as I've found this bit of advice in the Document Class source:

"""The document/literal style. Literal is the only (@use) supported since document/encoded is pretty much dead. Although the soap specification supports multiple documents within the soap <body/>, it is very uncommon. As such, suds presents an RPC view of service methods defined with a single document parameter. This is done so that the user can pass individual parameters instead of one, single document. To support the complete specification, service methods defined with multiple documents (multiple message parts), must present a document view for that method."""

I don't really grok all of that. But my direct hack bypasses the document class, so I should be getting multiple results if I have formed the request correctly.

Any happy picrs out there that can advise?

Thanks, Rich

written 3.7 years ago by Richard Llewellyn
5
3.7 years ago by
Sydney, Australia
Neilfws wrote:

Since the web page form allows submission of multiple IDs, perhaps you could hack something around that, using e.g. Mechanize (for Ruby, Python, Perl) ?

written 3.7 years ago by Neilfws

Now that sounds like old times. NCBI used to bar the lab occasionally for doing such things a bit too enthusiastically.... Does get their attention! With that in mind, I'll email PICR help and come back here and choose an answer once they reply.

reply written 3.7 years ago by Richard Llewellyn
4
3.7 years ago by
Bergen
Michael Dondrup wrote:

I didn't know this web service, but looking at the WSDL I can say with confidence that what you are trying to accomplish is not possible with it. The data type definition for the query is clear about this:

<element name="getUPIForAccession">
  <complexType>
    <sequence>
      <element name="accession" type="xsd:string"/>
      <element name="ac_version" type="xsd:string"/>
      <element maxOccurs="unbounded" name="searchDatabases" type="xsd:string"/>
      <element name="taxonId" type="xsd:string"/>
      <element name="onlyActive" type="xsd:boolean"/>
    </sequence>
  </complexType>
</element>

Well, it is not totally obvious, because maxOccurs is not set explicitly on the accession element or on the enclosing sequence, but in XML Schema it defaults to 1. For example, if you try something like:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:acc="http://www.ebi.ac.uk/picr/AccessionMappingService">
   <soapenv:Header/>
   <soapenv:Body>
      <acc:getUPIForAccession>
         <acc:accession>BAI16447</acc:accession>
         <acc:accession>BAI16448</acc:accession>
         <acc:ac_version/>
         <acc:searchDatabases>TrEMBL</acc:searchDatabases>
         <acc:taxonId/>
         <acc:onlyActive>true</acc:onlyActive>
      </acc:getUPIForAccession>
   </soapenv:Body>
</soapenv:Envelope>

the message does not validate.
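
To make the occurrence rule concrete, here is a minimal Python sketch that counts the children of a getUPIForAccession request and compares them with the limits in the WSDL excerpt above. It is not a full schema validator (it ignores element order and types), and the helper name `occurrence_errors` and the `OCCURS` table are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# (minOccurs, maxOccurs) for each child of getUPIForAccession, transcribed
# from the WSDL excerpt above. None means "unbounded". Where the WSDL sets
# no attribute, XML Schema defaults both values to 1.
OCCURS = {
    "accession": (1, 1),
    "ac_version": (1, 1),
    "searchDatabases": (1, None),
    "taxonId": (1, 1),
    "onlyActive": (1, 1),
}


def occurrence_errors(request_xml):
    """Return a list of occurrence violations for one request element.

    `request_xml` is the serialized getUPIForAccession element (not the
    whole SOAP envelope). Only child counts are checked.
    """
    root = ET.fromstring(request_xml)
    # Strip the "{namespace}" prefix ElementTree puts in front of tag names.
    counts = Counter(child.tag.split("}", 1)[-1] for child in root)
    errors = []
    for name, (low, high) in OCCURS.items():
        found = counts.get(name, 0)
        if found < low or (high is not None and found > high):
            limit = "unbounded" if high is None else high
            errors.append(f"{name}: found {found}, allowed {low}..{limit}")
    return errors
```

Run against the request above with two accession elements, this reports a single violation for accession, while repeated searchDatabases elements pass.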

Only a single query string is possible per web service invocation. There is no way you can change that on your side, irrespective of the SOAP toolkit you are using. The only options are to invoke the service multiple times or to use a different service, e.g. BioMart, as already recommended. If you were to use BioMart programmatically, I would recommend using its REST interface for the time being.
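
A minimal Python sketch of the "invoke the service multiple times" route, since the question uses suds. It assumes `service` is an object such as `suds.client.Client(wsdl_url).service` that exposes getUPIForAccession with the parameter order given in the WSDL; the helper name `map_accessions` and the way failures are collected are illustrative choices, not part of PICR or suds.

```python
def map_accessions(service, accessions, databases, only_active=True):
    """Call getUPIForAccession once per accession and collect the results.

    `service` is assumed to provide
    getUPIForAccession(accession, ac_version, searchDatabases, taxonId,
    onlyActive), i.e. the element order from the WSDL.
    Returns (results, failures), both keyed by accession.
    """
    results = {}
    failures = {}
    for accession in accessions:
        if accession in results or accession in failures:
            continue  # do not query the same accession twice
        try:
            results[accession] = service.getUPIForAccession(
                accession, None, databases, None, only_active)
        except Exception as exc:  # keep going if one lookup fails
            failures[accession] = exc
    return results, failures
```

With suds installed, a call might look like `map_accessions(Client("http://www.ebi.ac.uk/Tools/picr/service?wsdl").service, ["BAI16447", "BAI16448"], ["TREMBL", "SWISSPROT"])`. Since each accession is a separate request, it is probably wise to pace the calls rather than fire thousands at the server at once.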

modified 3.7 years ago • written 3.7 years ago by Michael Dondrup
1

No, unfortunately not without changing both the WSDL and the service implementation. The 'hack' would produce an invalid request message. If it actually returned results for several accessions (I guess it does not), that would be a bug in the service implementation and you couldn't rely on it. If the service accepts such a request without an error response (I guess it does), that is also a bug in the service implementation. You could nag the site admins in a kind fashion; I would guess they would respond to a polite email. If enough people request this, they might give it a go.

reply written 3.7 years ago by Michael Dondrup

Thanks for examining this! I now understand the above won't work, but just to be clear, there is no way to repeat the entire getUPIForAccession with different accessions multiple times within the body?

reply written 3.7 years ago by Richard Llewellyn

thanks, am contacting picrs, hopefully in a kind fashion. They did advertise the ability to do multiple accessions per request but maybe that changed due to load....

reply written 3.7 years ago by Richard Llewellyn

PICR has replied: The documentation is, unfortunately, misleading. We had to change some core PICR functionality and the documentation was never updated accordingly. Currently, it is only possible to map one accession per call using either SOAP or REST interface. Thank you for raising this issue. We have made note of it and will update the documentation at the next release of PICR.

So, another strike against web services.

reply written 3.7 years ago by Richard Llewellyn
1
3.7 years ago by
France
Pierre Lindenbaum wrote:

Update: I didn't fully read your question. As Michael said, the interfaces described by the WSDL don't allow you to query more than one accession number per request.

Calling getUPIForAccession is not enough: once you get a UPEntry, you have to call getIdenticalCrossReferences and/or getLogicalCrossReferences on it. I tested this with the Java code below and it worked fine:

import java.util.ArrayList;
import java.util.List;

import uk.ac.ebi.picr.accessionmappingservice.AccessionMapperInterface;
import uk.ac.ebi.picr.accessionmappingservice.AccessionMapperService;
import uk.ac.ebi.picr.model.CrossReference;
import uk.ac.ebi.picr.model.UPEntry;

public class PCIR {
    public static void main(String[] args) {
        try {
            AccessionMapperService service = new AccessionMapperService();
            AccessionMapperInterface mapper = service.getAccessionMapperPort();

            // Databases to search for cross-references
            List<String> databases = new ArrayList<String>();
            databases.add("TREMBL");
            databases.add("SWISSPROT");

            // One accession per call. Arguments follow the WSDL order:
            // accession, ac_version, searchDatabases, taxonId, onlyActive
            for (UPEntry entry : mapper.getUPIForAccession("BAI16447", null, databases, null, true)) {
                // Print the identical cross-references of this entry
                List<CrossReference> refs = entry.getIdenticalCrossReferences();
                if (refs != null) {
                    for (CrossReference cr : refs) {
                        System.out.println("Identical:" + cr.getAccession());
                    }
                }
                // Print the logical cross-references of this entry
                refs = entry.getLogicalCrossReferences();
                if (refs != null) {
                    for (CrossReference cr : refs) {
                        System.out.println("Logical:" + cr.getAccession());
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result:

Identical:C7K3U3
Identical:C7JAN3
Identical:C7JTK1
Identical:C7KD12
Identical:C7KMC8
Identical:C7KWK1
Identical:C7JJD3
Identical:C7L5Q6
Logical:C7JTK1_ACEPA
Logical:C7JJD3_ACEPA
Logical:C7K3U3_ACEPA
Logical:C7KMC8_ACEPA
Logical:C7L5Q6_ACEPA
Logical:C7JAN3_ACEP3
Logical:C7KWK1_ACEPA
Logical:C7KD12_ACEPA
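
For the Python side of the question, a rough sketch of the same traversal. It assumes each returned entry exposes `identicalCrossReferences` and `logicalCrossReferences` attributes whose items have an `accession` field; those attribute names are guessed from the Java getters above and have not been checked against what suds actually returns. The function name `cross_reference_accessions` is made up for this example.

```python
def cross_reference_accessions(entries):
    """Collect accessions from UPEntry-like objects, grouped by reference type.

    Assumption: attribute names mirror the Java getters
    (getIdenticalCrossReferences -> identicalCrossReferences, and so on).
    Missing or empty attributes are skipped.
    """
    found = {"Identical": [], "Logical": []}
    attributes = (
        ("Identical", "identicalCrossReferences"),
        ("Logical", "logicalCrossReferences"),
    )
    for entry in entries or []:
        for label, attribute in attributes:
            for ref in getattr(entry, attribute, None) or []:
                found[label].append(ref.accession)
    return found
```

The result is a dictionary with one list per reference type, which can be printed in the same "Identical:" / "Logical:" form as the output above.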
modified 3.7 years ago • written 3.7 years ago by Pierre Lindenbaum

Thanks, yes, Python suds was kind enough to make the second call for me and I didn't show it. Thanks for your helpful example.

reply written 3.7 years ago by Richard Llewellyn