Get publication counts per gene per year
Entering edit mode
4.5 years ago
abhijit.synl ▴ 60


I would like to write a edirect query to extract number of publications per gene per year. The group I am interested in is Viridiplantae. So for all species under this group, given a date range, I would like to get the publication count for each gene in that species. The final output that I am looking for is something like

YEAR Genus_Species Gene_Symbol Publication_Count
1970 Arabidopsis thaliana PHYA 3
1971 Arabidopsis thaliana PHYA 2

I have gotten this far. For an example gene id (816394) in taxon Arabidopsis thaliana (txid3702) I can get the count of all the pubmed articles related to this gene

esearch -db gene -query "txid3702[Organism:exp] AND 816394[UID]" | elink -target pubmed

After this the next step is to download in xml or docsum format the articles and filter the articles by date [PDAT] of publication. This is the strategy I am using. I used this next command but the error was "Too many requests"

esearch -db gene -query "txid3702[Organism:exp] AND 816394[UID]" | elink -target pubmed | efetch -format xml | xtract -pattern PubmedArticle -element PubDate

I don't know how to get around this. Thanks for the help

eutilities • 889 views

Login before adding your answer.

Traffic: 2114 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6