Question: Associations between biological terms and gene names in the literature
22 months ago by Sergio Martínez Cuesta
Cambridge, UK

Dear all,

Given one or more biological terms of interest (e.g. G-quadruplex, MeSH Unique ID: D054856), I am looking for a robust tool to extract all known human gene names (e.g. TP53, P53) that co-occur in proximity to my query term across all abstracts and freely available full texts in PubMed.

Based on a previous post, the most useful tool I have tested so far is PolySearch2. You type a query keyword after selecting Given: Text and setting Find ALL associated to, e.g., Genes/proteins. In the end, you get a result table and can download all associations in .json format.

However, I can think of two drawbacks:

(1) Although it returns an Entity id that can later be looked up in the thesauri provided in the Downloads to find the corresponding gene name, it does not return standard Ensembl gene ids or UniProtKB accessions.

(2) I could not find a way to restrict results to a given species, e.g. Homo sapiens.

Are you aware of any other tool that would allow me to find similar associations between biological terms in general and gene names?


pubmed text mining literature • 698 views
modified 16 months ago by Maria_Levchenko • written 22 months ago by Sergio Martínez Cuesta
16 months ago, Maria_Levchenko wrote:

Europe PMC offers an open RESTful annotations API. You can access text-mined annotations contained in PubMed abstracts and open access full-text articles, and find co-occurrences between different text-mined entities, e.g. gene names and GO terms that appear in the same paper. The list of text-mined annotations includes accessions, genes/proteins, chemicals, organisms, diseases, Gene Ontology, gene mutations, gene-disease relationships, gene functions, protein-protein interactions, phosphorylation events, etc. You can try the API here:
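For anyone who wants to script this, here is a minimal sketch of how a co-occurrence lookup might be put together. The endpoint URL, the `articleIds`/`type`/`format` parameters, and the response fields (`source`, `extId`, `annotations`, `type`, `exact`) are assumptions based on my reading of the annotations API and should be checked against the official documentation. The helper names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Assumed endpoint of the Europe PMC annotations API; verify before use.
BASE_URL = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"


def build_url(article_ids, annotation_type=None):
    """Build a request URL for a batch of article IDs (e.g. 'MED:<pmid>')."""
    params = {"articleIds": ",".join(article_ids), "format": "JSON"}
    if annotation_type:
        # Assumed type label, e.g. "Gene_Proteins"
        params["type"] = annotation_type
    return BASE_URL + "?" + urlencode(params)


def entities_by_article(payload):
    """Map each article ID to {annotation type: set of annotated strings}.

    `payload` is the decoded JSON response, assumed to be a list of articles,
    each carrying a list of annotations.
    """
    result = {}
    for article in payload:
        article_id = f"{article.get('source')}:{article.get('extId')}"
        by_type = defaultdict(set)
        for ann in article.get("annotations", []):
            by_type[ann.get("type")].add(ann.get("exact"))
        result[article_id] = dict(by_type)
    return result


def articles_mentioning_both(entities, type_a, type_b):
    """Return sorted IDs of articles annotated with both entity types."""
    return sorted(
        article_id
        for article_id, types in entities.items()
        if types.get(type_a) and types.get(type_b)
    )
```

To use it, fetch `build_url([...])` with any HTTP client, decode the JSON, and pass the result to `entities_by_article`. Note that this only tells you that two entities were annotated in the same article, not how strongly they are associated.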

written 16 months ago by Maria_Levchenko

Thanks Maria,

Could you elaborate on how to derive a score for the strength of association between text-mined annotations obtained using the Europe PMC API, please?

written 16 months ago by Sergio Martínez Cuesta

For annotations in Europe PMC there is no information on association scores. For named entity annotations (genes, GO, diseases, organisms...) you can only infer co-occurrence in a specific section of the article. Annotations reporting relationships, e.g. gene-disease relationships, are supplied by external providers, who may display the association information on their side. For example, Open Targets scores target-disease associations based on their data sources, which in addition to literature may include genetic associations, somatic mutations, etc. This information can be retrieved from them using the Open Targets API.
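If you still want a rough score from co-occurrence counts alone, you would have to compute it yourself. The sketch below uses pointwise mutual information (PMI), a generic text-mining heuristic. It is not something Europe PMC or Open Targets provides, and the function names and inputs (sets of article IDs per term and per gene) are hypothetical.

```python
import math


def pmi(n_both, n_term, n_gene, n_total):
    """Pointwise mutual information from article counts.

    n_both:  articles mentioning both the term and the gene
    n_term:  articles mentioning the term
    n_gene:  articles mentioning the gene
    n_total: articles in the corpus considered
    """
    if min(n_both, n_term, n_gene, n_total) <= 0:
        return float("-inf")
    return math.log2((n_both * n_total) / (n_term * n_gene))


def rank_genes(term_articles, gene_articles, n_total):
    """Rank genes by PMI with the query term.

    term_articles: set of article IDs annotated with the term
    gene_articles: dict mapping gene name -> set of article IDs
    """
    scores = {}
    for gene, articles in gene_articles.items():
        shared = term_articles & articles
        if shared:
            scores[gene] = pmi(
                len(shared), len(term_articles), len(articles), n_total
            )
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

PMI tends to overrate rare genes that co-occur only once or twice, so in practice you would probably also require a minimum number of shared articles.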

written 15 months ago by Maria_Levchenko

If you'd like to learn more about the annotations API and what it offers, you can join an upcoming webinar:

written 15 months ago by Maria_Levchenko