Question: Using UniProt entry names to retrieve UniProtEntry data (ID and sequence) using UniProt JAPI (Java)
20 months ago by
koenrademaker10 wrote:

I am trying to use the UniProt JAPI for Java to get a protein sequence based on its entry name, and later to identify a consensus sequence. I'm working with NetBeans 8.2 and UniProt JAPI 1.0.14.

I have a list of entry names along with the peptides found by mass spectrometry, a short example (not actual data) is:

MAOM_YEAST_R.LATYGGD.K

MAOM_YEAST is my key to the full sequence. After separating it from the partial sequence, I want to use it to get the UniProt ID and from there the corresponding full protein sequence. The full sequence is then used to extend the partial sequence, to find possible matches outside the range of the partial sequence. To sum up the steps:

MAOM_YEAST > find UniProt ID (P36013) > find sequence > locate partial sequence > extend the partial sequence by ~5 amino acids in both directions > search for a consensus sequence
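
The splitting and extension steps need no UniProt access and can be sketched in plain Java. This is only a minimal sketch, not JAPI code: the class and method names (`PeptideWindow`, `splitKey`, `extend`) are hypothetical, the key layout is assumed to match the example above (entry name, underscore, peptide with one flanking residue on each side), and the sequence in `main` is a toy string rather than the real MAOM_YEAST sequence.

```java
class PeptideWindow {

    /** Splits a key such as "MAOM_YEAST_R.LATYGGD.K" into {entry name, bare peptide}. */
    static String[] splitKey(String key) {
        // Entry names look like NAME_SPECIES, so the second underscore ends the entry name.
        int first = key.indexOf('_');
        int second = (first < 0) ? -1 : key.indexOf('_', first + 1);
        if (second < 0) {
            throw new IllegalArgumentException("Unexpected key format: " + key);
        }
        String entryName = key.substring(0, second);
        String peptide = key.substring(second + 1);
        // Drop the flanking residues ("R." and ".K") if they are present.
        String[] parts = peptide.split("\\.");
        String core = (parts.length == 3) ? parts[1] : peptide;
        return new String[] { entryName, core };
    }

    /** Returns the peptide plus up to 'flank' residues on each side, or null if it is not found. */
    static String extend(String sequence, String peptide, int flank) {
        int i = sequence.indexOf(peptide);
        if (i < 0) {
            return null;
        }
        // Clamp the window so peptides near either terminus do not run off the sequence.
        int start = Math.max(0, i - flank);
        int end = Math.min(sequence.length(), i + peptide.length() + flank);
        return sequence.substring(start, end);
    }

    public static void main(String[] args) {
        String[] parts = splitKey("MAOM_YEAST_R.LATYGGD.K");
        String toySequence = "AAASIECRLATYGGDKDVDYGGG"; // toy data, not the real protein
        System.out.println(parts[0] + " " + parts[1] + " " + extend(toySequence, parts[1], 5));
    }
}
```

Clamping the window to the sequence bounds is a design choice of this sketch; a peptide close to the N- or C-terminus then yields a shorter window instead of an exception.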

I have consulted the UniProt JAPI documentation (included in the download), but especially uk.ac.ebi.kraken.interfaces.uniprot / UniProtId is where the confusion starts. To cite from the documentation:

How to work with this Interface

The standard way of retrieving this data type

The standard way of setting this data type

UniProtEntry entry = getEntryFromParserOrAPI();

entry.setUniProtId(DefaultUniProtFactory.getInstance().buildUniProtId("CYC_HUMAN"));

UniProtId id = entry.getUniProtId();

However, I get errors because the UniProtEntry entry is not loaded properly: getEntryFromParserOrAPI() does not work (I don't have the exact error at the moment, will post it ASAP). This method seems like the ideal way to do what I want, replacing "CYC_HUMAN" with another name to get the proper entry. If I understand the documentation correctly, a UniProtEntry should be able to return the UniProt ID for an entry name like "CYC_HUMAN" via getUniProtId(), and getSequence() could be used for the sequence.

My questions are:

1) Does anyone know how to go from an entry name like MAOM_YEAST or CYC_HUMAN to the corresponding UniProt ID and perhaps from there to the sequence?

2) Does anyone have a solution for the suggested code from the documentation to get it working?

Much thanks

20 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

1) Does anyone know how to go from an entry name like MAOM_YEAST or CYC_HUMAN to the corresponding UniProt ID and perhaps from there to the sequence?

$ curl -sL "http://www.uniprot.org/uniprot/MAOM_YEAST.xml" | \
     xmllint --xpath '//*[local-name()="sequence"]/text()' - | \
     tr -d '\n' |\
     awk '{S="LATYGGD";x=5;i=index($0,S);print substr($0,i-x,length(S)+2*x);}'

SIECRLATYGGDKDVDY
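
The download-and-extract part of this pipeline could be approximated in Java along the following lines. This is a rough sketch under assumptions, not UniProt JAPI code: it reads the FASTA format rather than XML, relying on the UniProtKB header layout `>db|ACCESSION|ENTRY_NAME description`, which yields the accession and the sequence in one pass. The names `FastaRecord`, `parse` and `fetch` are hypothetical, `fetch` needs Java 9+ for `readAllBytes`, and the URL to pass in is left to the caller (for example the address above with `.fasta` in place of `.xml`, assuming the site still serves it and no HTTP-to-HTTPS redirect is involved).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class FastaRecord {
    final String accession;
    final String entryName;
    final String sequence;

    FastaRecord(String accession, String entryName, String sequence) {
        this.accession = accession;
        this.entryName = entryName;
        this.sequence = sequence;
    }

    /** Parses a single-record FASTA text with a header like ">sp|P36013|MAOM_YEAST ...". */
    static FastaRecord parse(String fasta) {
        String[] lines = fasta.trim().split("\\R");
        String header = lines[0];
        if (!header.startsWith(">")) {
            throw new IllegalArgumentException("Not a FASTA record: " + header);
        }
        // Header fields are separated by '|': database, accession, entry name + description.
        String[] fields = header.substring(1).split("\\|", 3);
        if (fields.length < 3) {
            throw new IllegalArgumentException("Unexpected header: " + header);
        }
        String entryName = fields[2].split("\\s+", 2)[0];
        // The sequence is wrapped over several lines; join them into one string.
        StringBuilder seq = new StringBuilder();
        for (int i = 1; i < lines.length; i++) {
            seq.append(lines[i].trim());
        }
        return new FastaRecord(fields[1], entryName, seq.toString());
    }

    /** Downloads the text behind a URL (assumption: the caller supplies a working FASTA URL). */
    static String fetch(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

With a record in hand, `record.sequence.indexOf("LATYGGD")` plus a substring of the surrounding residues plays the role of the `awk` step.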

This seems like an interesting method and I will definitely try to translate it to Java to get it working. Would you, however, also happen to know a solution using the UniProt JAPI?

Reply written 20 months ago by koenrademaker10