Question

Associations between biological terms and gene names in the literature

0

6.7 years ago

Sergio Martínez Cuesta ▴ 230

Dear all,

Given one or more biological terms of interest (e.g. G-quadruplex, MeSH Unique ID: D054856), I am looking for a robust tool to extract all known human gene names (e.g. TP53, P53) that co-occur in proximity to my query term across all PubMed abstracts and freely available full texts.

Based on a previous post, the most useful tool I have tested so far is PolySearch2. You type a query keyword after selecting Given: Text and setting Find ALL associated to, e.g., Genes/proteins. You then get a results table and can download all associations in JSON format.

However, I can think of two drawbacks:

(1) Although it returns an Entity id that can later be looked up in the thesauri provided in the Downloads to find the corresponding gene name, it does not return standard Ensembl gene IDs or UniProtKB accessions.

(2) I could not find a way to restrict results to a given species, e.g. Homo sapiens.

Are you aware of any other tool that would allow me to find similar associations between biological terms in general and gene names?

Thanks!

text mining pubmed literature • 1.9k views

ADD COMMENT • link updated 6.2 years ago by Maria_Levchenko ▴ 60 • written 6.7 years ago by Sergio Martínez Cuesta ▴ 230

score 1 · Answer 1 · 2018-01-29

1

6.2 years ago

Maria_Levchenko ▴ 60

Europe PMC offers an open RESTful annotations API. You can access text-mined annotations contained in PubMed abstracts and open access full-text articles, and find co-occurrences between different text-mined entities, e.g. gene names and GO terms that appear in the same paper. The list of text-mined annotations includes accessions, genes/proteins, chemicals, organisms, diseases, Gene Ontology, gene mutations, gene-disease relationships, gene functions, protein-protein interactions, phosphorylation events, etc. You can try the API here: https://europepmc.org/AnnotationsApi
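As a rough illustration of how such a lookup could be scripted, here is a minimal Python sketch. The endpoint URL, the parameter names (`articleIds`, `type`, `format`), the `Gene_Proteins` type label and the response fields (`annotations`, `exact`, `type`) are assumptions on my part and should be checked against the API documentation linked above. Upper-casing the mention is only a crude stand-in for proper gene name normalisation.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against the Annotations API documentation.
BASE_URL = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"


def build_url(article_ids, annotation_type="Gene_Proteins"):
    """Build a request URL for a batch of article IDs such as 'MED:12345'."""
    params = {
        "articleIds": ",".join(article_ids),  # assumed parameter name
        "type": annotation_type,              # assumed parameter name and value
        "format": "JSON",
    }
    # Keep ':' and ',' unescaped so the URL stays readable.
    return BASE_URL + "?" + urlencode(params, safe=":,")


def fetch_annotations(article_ids, annotation_type="Gene_Proteins"):
    """Download and parse the annotations (performs a network request)."""
    with urlopen(build_url(article_ids, annotation_type)) as response:
        return json.load(response)


def genes_per_article(articles, annotation_type="Gene_Proteins"):
    """Count, for each gene mention, the number of articles it appears in."""
    counts = Counter()
    for article in articles:
        # A set ensures a gene mentioned many times in one paper counts once.
        genes = {
            annotation["exact"].upper()
            for annotation in article.get("annotations", [])
            if annotation.get("type") == annotation_type
        }
        counts.update(genes)
    return counts
```

The article IDs would come from a separate literature search for the term of interest (e.g. G-quadruplex); passing the parsed response to `genes_per_article` then gives a per-gene article count for that set of papers.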

ADD COMMENT • link 6.2 years ago by Maria_Levchenko ▴ 60

0

Thanks Maria,

Could you elaborate on how to derive a score for the strength of association between text-mined annotations obtained using the Europe PMC API, please?

ADD REPLY • link 6.2 years ago by Sergio Martínez Cuesta ▴ 230

0

Annotations in Europe PMC carry no information on association scores. For named entity annotations (genes, GO, diseases, organisms...) you can only infer co-occurrence in a specific section of the article. Annotations reporting relationships, e.g. gene-disease relationships, are supplied by external providers, who may display the association information on their side. For example, Open Targets scores target-disease associations based on their data sources, which in addition to literature may include genetic associations, somatic mutations, etc. This information can be retrieved from them using the Open Targets API.
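To make the section-level co-occurrence idea concrete, here is a small Python sketch that works on an already parsed API response. The field names (`annotations`, `type`, `section`, `exact`) and the type labels `Gene_Proteins` and `Gene Ontology` are assumptions and should be checked against the API documentation. The resulting article counts are a home-made ranking aid, not a score provided by Europe PMC.

```python
from collections import Counter, defaultdict
from itertools import product


def section_pairs(article, type_a="Gene_Proteins", type_b="Gene Ontology"):
    """Return the (a, b) entity pairs that share a section in one article."""
    by_section = defaultdict(lambda: {type_a: set(), type_b: set()})
    for annotation in article.get("annotations", []):
        kind = annotation.get("type")
        if kind in (type_a, type_b):
            # Annotations without a section label are grouped together.
            section = annotation.get("section", "unknown")
            by_section[section][kind].add(annotation["exact"])
    pairs = set()
    for entities in by_section.values():
        pairs.update(product(entities[type_a], entities[type_b]))
    return pairs


def pair_article_counts(articles, type_a="Gene_Proteins", type_b="Gene Ontology"):
    """Count in how many articles each pair co-occurs within a section."""
    counts = Counter()
    for article in articles:
        counts.update(section_pairs(article, type_a, type_b))
    return counts
```

Sorting the output of `pair_article_counts` by count gives a crude ordering of candidate associations; it says nothing about the nature of the relationship between the two entities.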

ADD REPLY • link 6.1 years ago by Maria_Levchenko ▴ 60

0

If you'd like to learn more about the annotations API and what it offers, you can join an upcoming webinar: https://www.ebi.ac.uk/training/events/2018/extracting-research-evidence-publications

ADD REPLY • link 6.1 years ago by Maria_Levchenko ▴ 60