Question: Extract PMIDs from a gene or protein ID
shelly.deforte (United States) wrote 4.2 years ago:

Given a UniProt ID, I am trying to automatically extract the related PubMed IDs (PMIDs) from PubMed. I can map the UniProt ID to something NCBI can understand. For instance, UniProt ID O14733 can be mapped to GI 6831583, and from http://www.ncbi.nlm.nih.gov/protein/O14733 you can launch a search that shows the associated PubMed articles at the URL http://www.ncbi.nlm.nih.gov/pubmed?linkname=protein_pubmed_weighted&from_uid=6831583

I have never used NCBI's E-utilities, so fetching these articles automatically may need only a very simple modification, but I can't figure it out. My best guess was http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&LinkName=protein_pubmed_weighted&from_uid=6831583, but this returns nothing.

Basically, given an ID such as 6831583, I want to return a list of PMIDs. I would rather do this in Python if possible. Any suggestions?


Similar posts: A: edirect: Entrez Unix Command line

A: Retrieve Pubmed Ids

written 4.2 years ago by Ashutosh Pandey
David W (New Zealand) wrote 4.2 years ago:

If you use the elink E-utility with dbfrom="protein" and db="pubmed", you'll get a list of PMIDs associated with that protein.

You can then use esummary or efetch on those PMIDs.
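
As a rough sketch of that two-step flow, the helpers below only build the request URLs, using the standard library. The helper names (`elink_url`, `esummary_url`) and the HTTPS base URL are assumptions of this sketch, not something stated in the thread; the parameters (`dbfrom`, `db`, `id`) are the ones named above.

```python
from urllib.parse import urlencode

# Assumed base URL for the E-utilities; adjust if your setup differs.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(protein_id, linkname=None):
    """Build an elink URL going from a protein record to PubMed."""
    params = {"dbfrom": "protein", "db": "pubmed", "id": str(protein_id)}
    if linkname:
        params["linkname"] = linkname
    return EUTILS_BASE + "/elink.fcgi?" + urlencode(params)


def esummary_url(pmids):
    """Build an esummary URL for a batch of PMIDs (comma-separated)."""
    params = {"db": "pubmed", "id": ",".join(str(p) for p in pmids)}
    # safe="," keeps the commas readable instead of percent-encoding them
    return EUTILS_BASE + "/esummary.fcgi?" + urlencode(params, safe=",")


print(elink_url("6831583"))
```

The PMIDs that come back from the first request would be passed to `esummary_url` (or an equivalent efetch URL) in batches rather than one request per article.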


Thanks, that's what I needed! 

written 4.2 years ago by shelly.deforte
shelly.deforte (United States) wrote 4.2 years ago:

For completeness, here's what I worked out with David W's help. The URL is constructed like this:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=protein&db=pubmed&id=215274019&linkname=protein_pubmed_weighted
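
That URL returns XML. A minimal sketch of pulling the PMIDs out of such a response with the standard library follows. The sample document is hand-written to mirror the general shape of an elink reply (LinkSet, LinkSetDb, Link, Id), and its link IDs are placeholders rather than real results; `pmids_from_elink_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the <Link> IDs are placeholders, not real PMIDs.
SAMPLE_XML = """<eLinkResult>
  <LinkSet>
    <DbFrom>protein</DbFrom>
    <IdList><Id>215274019</Id></IdList>
    <LinkSetDb>
      <DbTo>pubmed</DbTo>
      <LinkName>protein_pubmed_weighted</LinkName>
      <Link><Id>111</Id></Link>
      <Link><Id>222</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>"""


def pmids_from_elink_xml(xml_text, linkname=None):
    """Return the linked IDs, optionally restricted to one LinkName."""
    root = ET.fromstring(xml_text)
    pmids = []
    for link_set_db in root.iter("LinkSetDb"):
        name = link_set_db.findtext("LinkName")
        if linkname is not None and name != linkname:
            continue
        # Only <Link><Id> entries are targets; <IdList><Id> is the query ID.
        pmids.extend(link.findtext("Id") for link in link_set_db.findall("Link"))
    return pmids


print(pmids_from_elink_xml(SAMPLE_XML))
```

Filtering on `linkname` matters mainly when the request omits the `linkname` parameter and the reply may carry more than one LinkSetDb.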

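This is how I was able to retrieve the records in Biopython:

from Bio import Entrez

Entrez.email = "you@example.com"  # placeholder; NCBI asks for a contact address

protein_id = "215274019"

# Ask elink for the PubMed records linked to this protein record
handle = Entrez.elink(db="pubmed", dbfrom="protein", id=protein_id, linkname="protein_pubmed_weighted")
record = Entrez.read(handle)
handle.close()

# Each entry under 'Link' in the first LinkSetDb carries one PMID
for link in record[0]['LinkSetDb'][0]['Link']:
    print(link['Id'])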
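
One caveat with indexing `['LinkSetDb'][0]` directly: as far as I know, a protein with no linked articles comes back with an empty `LinkSetDb` list, in which case that lookup raises an IndexError. A small defensive sketch follows, operating on the already-parsed record. Plain lists and dicts with placeholder IDs stand in for the Biopython result here, and `extract_pmids` is a hypothetical helper.

```python
def extract_pmids(record):
    """Collect link IDs from a parsed elink result, tolerating empty link sets."""
    pmids = []
    for link_set in record:
        for link_set_db in link_set.get("LinkSetDb", []):
            for link in link_set_db.get("Link", []):
                pmids.append(str(link["Id"]))
    return pmids


# Stand-in structures with placeholder IDs, shaped like Entrez.read() output.
with_links = [{"LinkSetDb": [{"LinkName": "protein_pubmed_weighted",
                              "Link": [{"Id": "111"}, {"Id": "222"}]}]}]
no_links = [{"LinkSetDb": []}]

print(extract_pmids(with_links))
print(extract_pmids(no_links))
```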

Hey, I am basically trying to do the same thing, and I have a question. Do you know the differences between the different linknames, i.e., protein_pubmed_weighted, protein_pubmed, and protein_pubmed_refseq?

Best, Nils

written 2.9 years ago by nils.rudqvist
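
One way to look this up rather than guess: the einfo E-utility reports, for each database, its available link names together with a short description. The sketch below assumes Biopython's parsed einfo result exposes them under `record["DbInfo"]["LinkList"]` with `Name`, `Description`, and `DbTo` keys; `pubmed_links` and `fetch_protein_link_list` are hypothetical helper names, and the network call is defined but not run.

```python
def pubmed_links(link_list):
    """Keep only link entries whose target database is PubMed."""
    return [(link["Name"], link["Description"])
            for link in link_list if link["DbTo"] == "pubmed"]


def fetch_protein_link_list(email):
    """Query einfo for the protein database (needs Biopython and network access)."""
    from Bio import Entrez  # imported here so pubmed_links works without Biopython
    Entrez.email = email
    handle = Entrez.einfo(db="protein")
    record = Entrez.read(handle)
    handle.close()
    return record["DbInfo"]["LinkList"]


# Usage (not run here; the address is a placeholder):
# for name, description in pubmed_links(fetch_protein_link_list("you@example.com")):
#     print(name, "-", description)
```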
asking.for.help (Switzerland) wrote 4.2 years ago:

Depending on the scope of your project, you might want to directly download NCBI's look-up table between Entrez Gene IDs and PubMed IDs and integrate this table into your workflow.

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2pubmed.gz

I like this table because it enables easy (and computationally fast) filtering. For example, you can exclude papers that cover hundreds to thousands of different genes (and thus usually do not reveal gene-specific biology), or find genes that are only mentioned together with your genes of interest.
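
A minimal sketch of the first kind of filter, assuming the table is tab-separated with a `#`-prefixed header and three columns in the order tax_id, GeneID, PubMed_ID (worth checking against the header of the file you download). The function names, the `max_genes` threshold, and the tiny sample table with placeholder IDs are all illustrative.

```python
import csv
import io
from collections import defaultdict


def load_gene2pubmed(lines):
    """Map each PubMed ID to the set of Gene IDs it is linked to."""
    pmid_to_genes = defaultdict(set)
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header line
        _tax_id, gene_id, pmid = row[:3]
        pmid_to_genes[pmid].add(gene_id)
    return pmid_to_genes


def gene_specific_papers(pmid_to_genes, max_genes=100):
    """Drop papers linked to more than max_genes genes."""
    return {pmid: genes for pmid, genes in pmid_to_genes.items()
            if len(genes) <= max_genes}


# Tiny made-up table (placeholder IDs) in the assumed three-column layout.
sample = io.StringIO("#tax_id\tGeneID\tPubMed_ID\n"
                     "9606\t1\t500\n9606\t2\t500\n9606\t3\t500\n9606\t1\t501\n")
kept = gene_specific_papers(load_gene2pubmed(sample), max_genes=2)
print(sorted(kept))
```

For the real file, passing `gzip.open("gene2pubmed.gz", "rt")` to `load_gene2pubmed` should work the same way, since it also yields text lines.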


This looks like a great resource, though I don't think I can easily map my UniProt IDs to genes, and I think it might change the coverage of the papers if I did. Still, I'm definitely going to bookmark this folder, thanks!

written 4.2 years ago by shelly.deforte