Question: Locating Citations
Asked by Iddo, 6.6 years ago:

How would I go about finding citation networks for articles in PubMed? Ideally, given the PubMed ID of an article, I would like to get the PubMed IDs of all the articles that it cites.


I should probably clarify that this is something I want to do at high volume. That is, either with a web API or (even better) with a citation database.

Comment by Iddo, written 6.6 years ago

I should add that I found what seems to be a passable solution (see my answer below). Any comments would be helpful.

Comment by Iddo, modified 6.6 years ago

Answer by Peter, written 6.6 years ago:

I would use the NCBI Entrez Utilities web interface; specifically, Entrez Link (ELink) can give you citations for PMC articles. Sadly, this does not cover all articles in PubMed: only the subset in PubMed Central (PMC) has this citation data. See http://www.ncbi.nlm.nih.gov/books/NBK25499/

There is an example of fetching citations for a PMC paper using the Biopython wrapper for Entrez in the Biopython Tutorial, http://biopython.org/DIST/docs/tutorial/Tutorial.html or http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
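
As a rough illustration of the ELink approach described above, here is a minimal sketch using Biopython's Entrez wrapper. The link name `pubmed_pubmed_refs` and the helper names `extract_linked_ids` and `fetch_references_via_elink` are assumptions made for this example and are not taken from the thread; check the ELink documentation for the link names that apply. Coverage is limited as described above, so many PMIDs may return nothing.

```python
def extract_linked_ids(link_sets, link_name):
    """Collect the IDs listed under one LinkName in a parsed ELink result."""
    ids = []
    for link_set in link_sets:
        for link_db in link_set.get("LinkSetDb", []):
            if link_db.get("LinkName") == link_name:
                ids.extend(str(link["Id"]) for link in link_db.get("Link", []))
    return ids


def fetch_references_via_elink(pmid, email, link_name="pubmed_pubmed_refs"):
    """Ask ELink for the PubMed records that `pmid` cites (needs network access).

    The default link name is an assumption; verify it against the ELink docs.
    """
    # Imported here so the parsing helper above works without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.elink(dbfrom="pubmed", db="pubmed", id=pmid, linkname=link_name)
    try:
        link_sets = Entrez.read(handle)
    finally:
        handle.close()
    return extract_linked_ids(link_sets, link_name)
```

A call such as `fetch_references_via_elink("19304878", "you@example.org")` would return a list of PMID strings, or an empty list when no link data exists for that article.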

Answer by Charles Warden, written 6.6 years ago:

There are several text-mining tools, but I think they typically look for words that commonly occur together in the same publication (usually in the abstract):

http://cdwscience.blogspot.com/2013/03/bioinformatics-101-literature-text.html

It is not an algorithm per se, but I typically use Google Scholar to browse citations (typically going in the other direction, but I know it can work both ways since that is how it creates your library by default).

ResearchGate also provides citations (in both directions) for publications. For example, I can use it to see papers I have cited and papers citing my own papers:

https://www.researchgate.net/profile/Charles_Warden/publications/


Thanks. I should probably clarify that this is something I want to do at high volume. That is, either with a web API or (even better) with a citation database.

Comment by Iddo, written 6.6 years ago

I don't know a tool that does this, but I'd check out the links from the other response(s)

I feel like there must be some way to parse information in Google Scholar, but the technique may not be trivial. Hopefully some sort of database / text-mining expert is working on this ;)

Comment by Charles Warden, written 6.6 years ago

I haven't tried them out myself, but maybe these could be useful?

http://www.icir.org/christian/scholar.html

https://code.google.com/p/citations-gadget/

Comment by Charles Warden, written 6.6 years ago

Answer by Iddo, written 6.6 years ago:

So here is what I came up with, based on helpful comments here and from Twitter. Thanks especially to Peter Cock, Karin Verspoor and Nick Semenkovich.

Apparently, things have progressed a bit at the NIH since the Biopython manual was written: citations can now be retrieved from PubMed by PMID, and not only for papers available in PMC. I am not sure how far back the ability to retrieve citations of non-PMC papers goes, but it does work.

Code:

#!/usr/bin/env python3
import sys

from Bio import Entrez as ez

# NCBI asks for a contact address with every E-utilities request.
ez.email = "your@emailhere.com"


def get_citations(pmid):
    """Return the PMIDs of the papers that this paper cites."""
    cites_list = []
    handle = ez.efetch(db="pubmed", id=pmid, retmode="xml")
    pubmed_rec = next(ez.parse(handle))
    # Records without citation data may lack this list entirely.
    refs = pubmed_rec['MedlineCitation'].get('CommentsCorrectionsList', [])
    for ref in refs:
        if ref.attributes['RefType'] == 'Cites':
            cites_list.append(str(ref['PMID']))
    return cites_list


if __name__ == '__main__':
    for cited_pmid in get_citations(sys.argv[1]):
        print(cited_pmid)

So I ran this to see which papers the Biopython paper (PMID: 19304878) cites:

 $ ./my_citations.py 19304878

11975335
10827456
14630660
14681378
14871861
15117750
3162770
7984417
16381881
16377612
17148479
17202161
17562476
18689808
12368254

One of the returned references, 10827456, is a TIGS paper, which is not in PMC.

Not all references were returned. As far as I could tell, though, the missing ones were references to book chapters and to articles that are not indexed in PubMed.


After trying this in bulk, I see that Peter was right: many PubMed articles do not have citations available, although the set that does is not necessarily identical to PMC.

Comment by Iddo, written 6.6 years ago
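
For bulk runs like the one mentioned above, one possible approach is to collect results into an adjacency mapping and record which PMIDs came back empty, so the coverage gap can be measured. The sketch below is only an illustration: `build_citation_network` and its parameters are hypothetical names, the per-PMID lookup is passed in as a function (for instance the `get_citations` function from the answer above), and the default pause is an assumption based on NCBI's guideline of at most three requests per second without an API key.

```python
import time


def build_citation_network(pmids, fetch_citations, delay_seconds=0.34):
    """Map each PMID to the PMIDs it cites, tracking PMIDs with no data.

    `fetch_citations` is any callable that takes a PMID and returns an
    iterable of cited PMIDs; it is supplied by the caller.
    """
    network = {}
    without_citations = []
    for index, pmid in enumerate(pmids):
        if index and delay_seconds:
            time.sleep(delay_seconds)  # pause between requests
        try:
            cited = [str(c) for c in fetch_citations(pmid)]
        except KeyError:
            # Treat a missing field in the record as "no citation data".
            cited = []
        if cited:
            network[str(pmid)] = cited
        else:
            without_citations.append(str(pmid))
    return network, without_citations
```

The second return value lists the PMIDs for which nothing was retrieved, which makes it easy to estimate how many articles in a given batch lack citation data.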

If you can work out what the scope of the citation data is, it would be good to clarify the wording in the Biopython Tutorial. Thanks.

Comment by Peter, written 6.6 years ago

Answer by Maria_Levchenko (EMBL-EBI), written 3.2 years ago:

You can use the Europe PMC RESTful API for this. The references module (https://europepmc.org/RestfulWebService#refs) retrieves a count and a list of the publications referenced in a given publication. Construct the URL using your PMID: http://www.ebi.ac.uk/europepmc/webservices/rest/MED/[PMID]/references. The output format is XML or JSON. Note that not all publications have reference lists available. To find the publications that do, use query=has_reflist:y (http://www.ebi.ac.uk/europepmc/webservices/rest/search?query=has_reflist:y).
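
The URL pattern above can be exercised with a short script. This is a minimal sketch using only the Python standard library; the `format=json` parameter and the JSON field names (`referenceList`, `reference`, `id`, `source`) are assumptions about the response layout and should be checked against the Europe PMC documentation. The function names are hypothetical. Results may be paginated, which this sketch does not handle.

```python
import json
import urllib.request

BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest"


def references_url(pmid):
    """Build the references URL for a PubMed (MED) record."""
    return f"{BASE_URL}/MED/{pmid}/references?format=json"


def extract_reference_pmids(payload):
    """Pull PMIDs out of a decoded response (field names are assumed)."""
    references = payload.get("referenceList", {}).get("reference", [])
    return [
        str(ref["id"])
        for ref in references
        # Keep only references that resolve to a PubMed (MED) record.
        if ref.get("source") == "MED" and "id" in ref
    ]


def fetch_reference_pmids(pmid):
    """Fetch and parse the reference list for one PMID (needs network access)."""
    with urllib.request.urlopen(references_url(pmid), timeout=30) as response:
        payload = json.load(response)
    return extract_reference_pmids(payload)
```

References without a PubMed record (for example book chapters) are skipped by the filter, so the returned list can be shorter than the paper's full bibliography.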
