Question: Java access to the complete list of PubMed IDs or titles attached to one UniProt entry
19 months ago, taojincs wrote:

Is there any way to access, in Java, ALL the publications attached to one Swiss-Prot entry?

I know how to use the method List<Citation> citations = current.getCitationsNew(), and I can get the titles of these publications too. However, the list of publications is not complete. For example, for https://www.uniprot.org/uniprot/P0AD86/publications, publications 7 and 8 are not included in the output.

I checked the documentation carefully:
https://www.ebi.ac.uk/uniprot/japi/javadoc/uk/ac/ebi/kraken/interfaces/uniprot/UniProtEntry.html
https://www.ebi.ac.uk/uniprot/japi/usage.html
https://www.ebi.ac.uk/uniprot/japi/javadoc/uk/ac/ebi/uniprot/dataservice/query/class-use/Query.html#uk.ac.ebi.uniprot.dataservice.client.uniprot

I didn't find any other useful functions in the API. The only approach I came up with is to fetch the HTML page by URL and parse the titles out of it. I would prefer not to do this, as it is fragile and might stop working in the future.

Is there any way to get the complete list of PubMed IDs or publication titles for one protein entry in Java using the existing API?

Thank you.

pubmed api java swissprot • 466 views
19 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Here is some basic code using XPath:

$ javac Biostar376339.java 
$ java Biostar376339 
287010 Regulation of the threonine operon: tandem threonine and isoleucine codons in the control region and translational control of transcription termination.
6277952 Initiation, pausing, and termination of transcription in the threonine operon regulatory region of Escherichia coli.
2410621 Identification and characterization of mutants affecting transcription termination at the threonine operon attenuator.
7610040 Analysis of the Escherichia coli genome VI: DNA sequence of the region from 92.8 through 100 minutes.
9278503 The complete genome sequence of Escherichia coli K-12.
16738553 Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110.

I'm using XPath, but for something more complete/beautiful you'd better use XSLT.
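A minimal sketch of what such an XPath-based program could look like (this is not the code from the answer above, only an illustration). It assumes the entry is available as XML at https://www.uniprot.org/uniprot/P0AD86.xml and that each curated reference is a `reference/citation` element with a `title` child and a `dbReference` child of type `PubMed`. The class and method names are made up for the example. It only sees the citations stored in the entry itself, so computationally mapped publications would not appear.

```java
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name; not part of any UniProt library.
class UniProtCitationsXPath {

    /** Returns one "PMID<TAB>title" string per citation that carries a PubMed cross-reference. */
    static List<String> extractCitations(InputStream xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document dom = dbf.newDocumentBuilder().parse(xml);

        // local-name() lets the expressions ignore the document's default namespace.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList citations = (NodeList) xpath.evaluate(
            "//*[local-name()='reference']/*[local-name()='citation']"
                + "[*[local-name()='dbReference'][@type='PubMed']]",
            dom, XPathConstants.NODESET);

        List<String> out = new ArrayList<>();
        for (int i = 0; i < citations.getLength(); i++) {
            Node citation = citations.item(i);
            String pmid = xpath.evaluate(
                "*[local-name()='dbReference'][@type='PubMed']/@id", citation);
            String title = xpath.evaluate("*[local-name()='title']", citation);
            out.add(pmid + "\t" + title);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String accession = args.length > 0 ? args[0] : "P0AD86";
        // Assumed URL pattern for the XML rendering of an entry.
        URI uri = URI.create("https://www.uniprot.org/uniprot/" + accession + ".xml");
        try (InputStream in = uri.toURL().openStream()) {
            for (String line : extractCitations(in)) {
                System.out.println(line);
            }
        }
    }
}
```

Keeping the parsing in `extractCitations` separate from the download makes it possible to try the XPath expressions on a saved XML file without network access.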

19 months ago, Elisabeth Gasteiger (Geneva) wrote:

I am not familiar with the EBI java API, but I may be able to shed some light on the reason why publications 7 and 8 of P0AD86 are not included:

https://www.uniprot.org/uniprot/P0AD86/publications lists 8 publications, the first 6 of which have been curated by a biocurator and are actually listed in the database record (as you can see in text, xml or rdf formats): e.g. text: https://www.uniprot.org/uniprot/P0AD86.txt

The remainder of the publications have been computationally mapped: https://www.uniprot.org/uniprot/P0AD86/publications?query=&fil=Mapped

You can obtain their PubMed IDs with this query:

https://www.uniprot.org/citations/?query=mappedin:(id:P0AD86)

and download them with

https://www.uniprot.org/citations/?query=mappedin:(id:P0AD86)&format=list

The latter can be used programmatically (REST), and no specific parsing is required (tab-separated, Excel, XML/RDF formats are also available).

Similarly, all citations (those in the entry AND the mapped ones) can be obtained with the query

https://www.uniprot.org/citations/?query=mappedin:(id:P0AD86)+or+citedin:(id:P0AD86)&sort=score

(use format={tab|list|rdf} for programmatic use).
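For use from Java, the REST queries above could be wrapped roughly as follows. This is only a sketch: it assumes the `citations` endpoint still answers at the URL given in this answer and that `format=list` returns one identifier per line. The class and method names are invented for the example, and it needs Java 11 or later for `java.net.http`.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of the EBI Java API.
class UniProtMappedCitations {

    /** Builds the query text: mapped citations only, or mapped plus curated ones. */
    static String buildQuery(String accession, boolean includeCurated) {
        String query = "mappedin:(id:" + accession + ")";
        if (includeCurated) {
            query += " or citedin:(id:" + accession + ")";
        }
        return query;
    }

    /** Builds the request URI with the query percent-encoded and format=list. */
    static URI buildUri(String accession, boolean includeCurated) {
        String encoded = URLEncoder.encode(
            buildQuery(accession, includeCurated), StandardCharsets.UTF_8);
        return URI.create(
            "https://www.uniprot.org/citations/?query=" + encoded + "&format=list");
    }

    /** Splits a list-format response body into non-empty, trimmed lines. */
    static List<String> parseList(String body) {
        return body.lines()
            .map(String::trim)
            .filter(line -> !line.isEmpty())
            .collect(Collectors.toList());
    }

    /** Sends the request and returns the identifiers (assumed: one per line). */
    static List<String> fetchIds(String accession, boolean includeCurated) throws Exception {
        HttpClient client = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();
        HttpRequest request = HttpRequest.newBuilder(buildUri(accession, includeCurated))
            .GET()
            .build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected HTTP status " + response.statusCode());
        }
        return parseList(response.body());
    }

    public static void main(String[] args) throws Exception {
        String accession = args.length > 0 ? args[0] : "P0AD86";
        for (String id : fetchIds(accession, true)) {
            System.out.println(id);
        }
    }
}
```

Passing `false` as the second argument restricts the result to the computationally mapped publications; `true` adds the `citedin` clause so that the curated ones are returned as well.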
