Locating Citations
4
1
10.3 years ago
Iddo ▴ 230

How would I go about finding citation networks for articles in PubMed? Ideally, given a PubMed ID for an article, I would like to have the PubMed IDs of all the articles that article cites.

• 5.0k views
0

I should probably clarify that this is something I want to do at high volume. That is, either with a web API or (even better) a citation database.

0

I should add that I found what seems to be a passable solution (see my answer below). Any comments would be helpful.

2
10.3 years ago
Peter 6.0k

I would use the NCBI Entrez Utilities web interface; specifically, Entrez Link (elink) can give you citations for PMC articles. Sadly, this does not cover all articles in PubMed: only the subset in PubMed Central (PMC) has this citation data. See http://www.ncbi.nlm.nih.gov/books/NBK25499/

There is an example of fetching citations for a PMC paper using the Biopython wrapper for Entrez in the Biopython Tutorial, http://biopython.org/DIST/docs/tutorial/Tutorial.html or http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
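
As a rough sketch of the elink approach, using only the Python standard library rather than the Biopython wrapper: the helpers below build an elink request URL and pull the linked IDs out of an elink XML response. The link name `pubmed_pubmed_refs` (articles cited by the given PMID) is my assumption about which link best fits the question. The function names are hypothetical, and the sample XML is a trimmed, hand-written illustration rather than a real response.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Public E-utilities endpoint for Entrez Link
ELINK_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(pmid, linkname="pubmed_pubmed_refs"):
    """Build an elink URL asking for PubMed records linked to `pmid`.

    The default linkname is an assumption, not something from this thread.
    """
    params = {
        "dbfrom": "pubmed",
        "db": "pubmed",
        "linkname": linkname,
        "id": str(pmid),
    }
    return ELINK_URL + "?" + urllib.parse.urlencode(params)


def parse_linked_ids(xml_text):
    """Return the linked IDs (as strings) from an elink XML response."""
    root = ET.fromstring(xml_text)
    # Only IDs under LinkSetDb/Link are links; IdList holds the query ID
    return [el.text for el in root.findall("LinkSet/LinkSetDb/Link/Id")]


# Hand-written, trimmed sample in the shape of an elink response
SAMPLE_XML = """<eLinkResult><LinkSet>
<DbFrom>pubmed</DbFrom><IdList><Id>19304878</Id></IdList>
<LinkSetDb><DbTo>pubmed</DbTo><LinkName>pubmed_pubmed_refs</LinkName>
<Link><Id>11975335</Id></Link><Link><Id>10827456</Id></Link>
</LinkSetDb></LinkSet></eLinkResult>"""

if __name__ == "__main__":
    print(build_elink_url(19304878))
    print(parse_linked_ids(SAMPLE_XML))
```

To use it against the live service, fetch the URL (for example with `urllib.request.urlopen`) and pass the decoded body to `parse_linked_ids`. Expect an empty list for papers that have no citation data.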

0
10.3 years ago

There are several text-mining tools, but I think they are typically for words commonly present in the same publication (usually in the abstract):

http://cdwscience.blogspot.com/2013/03/bioinformatics-101-literature-text.html

It is not an algorithm per se, but I typically use Google Scholar to browse citations (typically going in the other direction, but I know it can work both ways since that is how it creates your library by default).

ResearchGate also provides citations (in both directions) for publications. For example, I can use it to see papers I have cited and papers citing my own papers:

https://www.researchgate.net/profile/Charles_Warden/publications/

0

Thanks. I should probably clarify that this is something I want to do at high volume. That is, either with a web API or (even better) a citation database.

0

I don't know of a tool that does this, but I'd check out the links from the other response(s).

I feel like there must be some way to parse information in Google Scholar, but the technique may not be trivial. Hopefully some sort of database / text-mining expert is working on this ;)

0

Obviously haven't tried them out myself, but maybe these can be useful?

http://www.icir.org/christian/scholar.html

https://code.google.com/p/citations-gadget/

0
10.3 years ago
Iddo ▴ 230

So here is what I came up with, based on helpful comments here and from Twitter. Thanks especially to Peter Cock, Karin Verspoor and Nick Semenkovich.

Apparently, things have progressed a bit at the NIH since the Biopython manual was written: PubMed can now be queried for the PMIDs a paper cites, and not only for PMC-available papers. I am not sure how far back the ability to retrieve citations of non-PMC papers goes, but it does work.

Code:

#!/usr/bin/env python3
import sys

from Bio import Entrez as ez

# NCBI asks for a contact address with every E-utilities request
ez.email = "your@emailhere.com"


def get_citations(pmid):
    """
    Returns the pmids of the papers this paper cites
    """
    cites_list = []
    handle = ez.efetch("pubmed", id=pmid, retmode="xml")
    # Entrez.read returns the whole article set; take the first (only) article
    pubmed_rec = ez.read(handle)['PubmedArticle'][0]
    handle.close()
    # Records without citation data have no CommentsCorrectionsList at all
    for ref in pubmed_rec['MedlineCitation'].get('CommentsCorrectionsList', []):
        if ref.attributes['RefType'] == 'Cites':
            cites_list.append(str(ref['PMID']))
    return cites_list


if __name__ == '__main__':
    z = get_citations(sys.argv[1])
    for i in z:
        print(i)

So I tried to run this to see which papers the Biopython paper (PMID: 19304878) cites:

 $ ./my_citations.py 19304878

11975335
10827456
14630660
14681378
14871861
15117750
3162770
7984417
16381881
16377612
17148479
17202161
17562476
18689808
12368254

One of the returned references, 10827456, is a TIGS paper, which is not in PMC.

Not all references were returned. But as far as I could tell, the references that were not returned were to book chapters and articles that are not indexed in PubMed.

0

After trying this in bulk, I see that Peter was right: although the gap does not necessarily coincide with PMC coverage, many PubMed articles do not have citations available.

0

If you can work out what the scope of the citation data is, it would be good to clarify the wording in the Biopython Tutorial... - thanks

0
6.9 years ago

You can use the Europe PMC RESTful API for this. The references module (https://europepmc.org/RestfulWebService#refs) retrieves a count and a list of the publications referenced in a given publication. Construct the URL using your PMID: http://www.ebi.ac.uk/europepmc/webservices/rest/MED/[PMID]/references. The output format is XML or JSON. Note that not all publications will have reference lists available. To find publications which do, use query=has_reflist:y (http://www.ebi.ac.uk/europepmc/webservices/rest/search?query=has_reflist:y).
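
A minimal sketch of how the references call could be wrapped in Python, under some assumptions of mine: that `format=json` selects JSON output, that the JSON nests entries under `referenceList` / `reference`, and that references indexed in PubMed carry `"source": "MED"` with the PMID in `id`. The function names and the sample payload are hypothetical; check them against a real response before relying on them.

```python
import json

# Base URL taken from the answer above (https used here as an assumption)
BASE = "https://www.ebi.ac.uk/europepmc/webservices/rest"


def references_url(pmid, fmt="json"):
    """Build the references URL for a PubMed (MED) record."""
    return f"{BASE}/MED/{pmid}/references?format={fmt}"


def cited_pmids(payload):
    """Extract PMIDs from a decoded references response.

    Assumes the layout {"referenceList": {"reference": [...]}};
    entries without a PubMed ID (e.g. book chapters) are skipped.
    """
    refs = payload.get("referenceList", {}).get("reference", [])
    return [r["id"] for r in refs if r.get("source") == "MED" and "id" in r]


# Hand-written sample in the assumed response shape, not real output
SAMPLE_JSON = """{"hitCount": 3, "referenceList": {"reference": [
  {"id": "11975335", "source": "MED"},
  {"title": "A book chapter with no PubMed record"},
  {"id": "10827456", "source": "MED"}
]}}"""

if __name__ == "__main__":
    print(references_url(19304878))
    print(cited_pmids(json.loads(SAMPLE_JSON)))
```

For bulk use, fetch each URL (for example with `urllib.request.urlopen`), decode the body with `json.loads`, and treat an empty result as "no reference list available" rather than "cites nothing".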
