Question: Python module to download any paper by DOI
akarazeev (Moscow/MIPT) wrote, 14 months ago:

Hi. I am wondering whether there is any Python module that can download a paper by its DOI.

I am familiar with the SciHub module (https://github.com/zaytoun/scihub.py), but it runs into problems with captchas.

Perhaps you know of other methods for downloading papers through some API.

I'm willing to pay to avoid captchas, but I don't know of any service that offers this option.

Thank you in advance.

Tags: pubmed, doi, download, scihub, python
jrj.healey (United Kingdom) wrote, 14 months ago:

It's not a module per se, and it isn't specific to DOIs, but you could do something like the following:

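import requests

def getDOI(top_hit):
    """Query the PDBe REST API to get the DOI of a publication associated with a PDB ID."""
    pdb_id = str(top_hit)  # use the same string for the URL and for the JSON key
    try:
        query = requests.get("https://www.ebi.ac.uk/pdbe/api/pdb/entry/publications/" + pdb_id, timeout=30)
        qjson = query.json()
        # Take the DOI of the first publication listed for this entry.
        doi = qjson[pdb_id][0]['doi']

        if not doi:
            doi = "No DOI found."

    except (KeyError, IndexError):
        # The ID is missing from the response, or it has no publications listed.
        doi = "Key error. ID likely deprecated."

    return doi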

I use this code snippet to return DOIs from a PDB ID query. You may be able to chop it up to suit your own needs.

Otherwise, take a look at the esearch/efetch functions in Bio.Entrez (http://biopython.org/DIST/docs/api/Bio.Entrez-module.html).

Maria_Levchenko (EMBL-EBI) wrote, 8 months ago:

You could try the Europe PMC API to retrieve a publication's full text via its DOI. You would first need to map the DOI to the corresponding PMCID using the search module, and then use the PMCID to retrieve the full-text XML from the open access subset. For example, the search module returns PMC3257301 as the PMCID for the DOI 10.1371/journal.ppat.1002485, and the fullTextXML module then retrieves the full text for PMC3257301.
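
The two steps described here could be sketched in Python roughly as follows. This is only a minimal illustration, not an official client: the helper names (search_url, extract_pmcid, fulltext_url, fetch_fulltext_xml) are made up for this example, the base URL is assumed to be the Europe PMC RESTful web service root, and the JSON search response is assumed to list hits under resultList/result with an optional pmcid field. It uses only the standard library.

```python
import json
import urllib.parse
import urllib.request

# Assumed root of the Europe PMC RESTful web service.
BASE = "https://www.ebi.ac.uk/europepmc/webservices/rest"


def search_url(doi):
    """Build a search-module URL that queries by DOI and asks for JSON."""
    params = urllib.parse.urlencode({"query": f'DOI:"{doi}"', "format": "json"})
    return f"{BASE}/search?{params}"


def extract_pmcid(search_json):
    """Return the first PMCID found in a search response, or None."""
    for hit in search_json.get("resultList", {}).get("result", []):
        if hit.get("pmcid"):
            return hit["pmcid"]
    return None


def fulltext_url(pmcid):
    """Build the fullTextXML-module URL for a PMCID."""
    return f"{BASE}/{pmcid}/fullTextXML"


def fetch_fulltext_xml(doi, timeout=30):
    """Map a DOI to a PMCID, then download the full-text XML as a string."""
    with urllib.request.urlopen(search_url(doi), timeout=timeout) as resp:
        pmcid = extract_pmcid(json.load(resp))
    if pmcid is None:
        return None  # no PMCID for this DOI, so nothing to fetch
    with urllib.request.urlopen(fulltext_url(pmcid), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_fulltext_xml("10.1371/journal.ppat.1002485") would be expected to go through PMC3257301, as in the example above. A paper outside the open access subset would most likely make the second request fail with an HTTP error, which this sketch does not catch.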
