PubChem Patents Bulk Download
The PubChem FTP download site only offers downloads of compound information. Is there a way to bulk download patent data for each compound?
Pubchem
Patent
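Here is some Python code you can use if you have a CID in mind. It looks up the patent IDs cross-referenced to the compound, then prints the submission date of each US patent: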
import json
import time
import urllib.error
import urllib.request


def print_us_patent_dates(cid):
    # Look up all patent IDs cross-referenced to this compound.
    pubchemapi = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{0}/xrefs/PatentID/JSON".format(cid)
    try:
        url = urllib.request.urlopen(pubchemapi)
    except urllib.error.HTTPError:
        # Retry once after a short pause.
        print("tried {} will sleep on it".format(pubchemapi))
        time.sleep(5)
        try:
            url = urllib.request.urlopen(pubchemapi)
        except urllib.error.HTTPError:
            print("can't find {0}".format(cid))
            return None
    pbcresp = json.loads(url.read().decode())
    patents = pbcresp['InformationList']['Information'][0]['PatentID']
    for patent in patents:
        # Only US patents are reported.
        if patent.startswith('US'):
            patentapi = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/patent/{0}/JSON?heading=Patent+Submission+Date".format(patent)
            try:
                url = urllib.request.urlopen(patentapi)
            except urllib.error.HTTPError:
                print("tried {} will sleep on it".format(patentapi))
                time.sleep(5)
                try:
                    url = urllib.request.urlopen(patentapi)
                except urllib.error.HTTPError:
                    # Skip this patent and move on to the next one.
                    print("can't find {0}".format(patent))
                    continue
            patresp = json.loads(url.read().decode('latin-1'))
            submissiondate = patresp['Record']['Section'][0]["Information"][0]["Value"]["DateISO8601"][0]
            print("{0} {1} {2}".format(cid, patent, submissiondate))


print_us_patent_dates(2446)
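Since the question is about many compounds, here is a rough sketch of how the first lookup might be batched. It assumes that PUG REST accepts a comma-separated list of CIDs in the URL and that each entry in the response carries its own "CID" field. Check both against the current PubChem documentation before relying on them. The helper names `build_patent_xref_url` and `patents_by_cid` are made up for this example, and the sample response uses placeholder patent IDs, not real data. Nothing here touches the network, so you would still fetch the URL as in the code above.

```python
PUG_XREF_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{0}/xrefs/PatentID/JSON"


def build_patent_xref_url(cids):
    """Return one PUG REST URL covering every CID in `cids` (comma-separated)."""
    return PUG_XREF_URL.format(",".join(str(cid) for cid in cids))


def patents_by_cid(pbcresp, prefix="US"):
    """Map each CID in a parsed xrefs response to its patent IDs starting with `prefix`."""
    result = {}
    for info in pbcresp.get("InformationList", {}).get("Information", []):
        # Assumption: a compound without patent cross-references may lack the PatentID key.
        ids = info.get("PatentID", [])
        result[info["CID"]] = [p for p in ids if p.startswith(prefix)]
    return result


# Made-up response in the assumed shape; the patent IDs are placeholders.
sample = {
    "InformationList": {
        "Information": [
            {"CID": 2446, "PatentID": ["US-0000001-A", "EP-0000002-A1"]},
            {"CID": 2244},
        ]
    }
}

print(build_patent_xref_url([2446, 2244]))
print(patents_by_cid(sample))
```

For a long list of CIDs you would probably want to send them in modest chunks and pause between requests, as the single-CID code does with time.sleep.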