Question: Retrieve PubMed records based on some genes
willson wrote (4 weeks ago):

Hi,

I am trying to retrieve PubMed records with Biopython based on some gene names (e.g. all PMIDs whose abstracts contain these gene names). I wrote the following code and it returns some results, but I am not sure it is working correctly. Will this code miss articles that use a variant gene symbol (e.g. P53 for TP53) or a synonym? Also, can I trust PubMed's filtering with this approach, or should I fetch all of the abstracts and search/filter them myself?

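from Bio import Entrez, Medline

# NCBI asks for a contact address with E-utilities requests (placeholder value).
Entrez.email = "your.name@example.com"

# Note: esearch returns at most 20 IDs unless retmax is raised.
handle = Entrez.esearch(db="pubmed", term="TP53[gene] AND BRCA1[gene] AND CXCL12[gene]")
record = Entrez.read(handle)
handle.close()
idlist = record["IdList"]

# Fetch the matching records in MEDLINE text format (see the MEDLINE format table).
handle = Entrez.efetch(db="pubmed", id=idlist, rettype="medline", retmode="text")
records = list(Medline.parse(handle))
handle.close()

for record in records:
    print("title:", record.get("TI", "?"))
    print("authors:", record.get("AU", "?"))
    print("source:", record.get("SO", "?"))
    print("abstract:", record.get("AB", "?"))
    print("")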
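
For the manual-filtering alternative raised above, here is a minimal sketch of a local check that could be run over the parsed records. It assumes each record is a dict with the "TI" and "AB" keys used in the loop above; the helper name mentions_gene and the sample records are hypothetical stand-ins (made-up text, not real articles), and the whole-word rule is only one possible matching policy.

```python
import re


def mentions_gene(record, names):
    """Return True if any name occurs as a whole token in the title or abstract."""
    text = " ".join([record.get("TI", ""), record.get("AB", "")])
    alternatives = "|".join(re.escape(name) for name in names)
    # Require a non-alphanumeric boundary on both sides so "P53" does not
    # match inside a longer token such as "XP53A".
    pattern = r"(?<![A-Za-z0-9])(?:" + alternatives + r")(?![A-Za-z0-9])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Stand-in records shaped like Medline.parse output (made-up text).
sample_records = [
    {"TI": "Example title one", "AB": "We measured p53 expression in cell lines."},
    {"TI": "Example title two", "AB": "This abstract mentions no gene of interest."},
]

hits = [r for r in sample_records if mentions_gene(r, ["TP53", "P53"])]
print(len(hits))
```

A check like this only sees the names it is given, so it does not answer the synonym problem by itself; it just makes the matching rule explicit and inspectable.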
pubmed biopython python gene • 121 views

I am going to make some general comments.

You will want to use OR instead of AND in your search term: with NCBI eUtils I get no hits for the example above when all three genes are combined with AND. A ton of hits appear if the terms are used individually or combined with OR. What is your ultimate aim in doing this? There must be a lot of records in PubMed with these terms.

My search was done using:

esearch -db pubmed -query "TP53[gene] OR BRCA1[gene] OR CXCL12[gene]" | efetch -format abstract
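
On the synonym question, one option is to spell the aliases out in the query itself. The sketch below only builds the query string; the helper name build_gene_query and the alias table are hypothetical (only P53 for TP53 comes from the question, and any other aliases would have to be supplied by hand). It uses PubMed's Title/Abstract field tag [tiab], which is an assumption about what you want to search, not a replacement for the [gene] tag used above.

```python
def build_gene_query(synonyms, field="tiab", combine="OR"):
    """Build a PubMed query string from a {symbol: [aliases]} mapping."""
    groups = []
    for symbol, aliases in synonyms.items():
        # Keep the official symbol first and drop duplicate names.
        names = list(dict.fromkeys([symbol, *aliases]))
        tagged = " OR ".join(f"{name}[{field}]" for name in names)
        groups.append(f"({tagged})")
    return f" {combine} ".join(groups)


# Hand-curated aliases (assumption): extend these lists yourself.
gene_synonyms = {"TP53": ["P53"], "BRCA1": [], "CXCL12": []}

query = build_gene_query(gene_synonyms)
print(query)
# (TP53[tiab] OR P53[tiab]) OR (BRCA1[tiab]) OR (CXCL12[tiab])
```

The resulting string can be passed as the term argument of Entrez.esearch in Biopython, or as the -query argument of the esearch command above. Passing combine="AND" requires every gene group to match while still accepting any alias within a group.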
modified 4 weeks ago • written 4 weeks ago by genomax