I would like to write an EDirect query to extract the number of publications per gene per year. The group I am interested in is Viridiplantae. For every species in this group, given a date range, I would like to get the publication count for each gene in that species. The final output I am looking for is something like:
YEAR  Genus_Species         Gene_Symbol  Publication_Count
1970  Arabidopsis thaliana  PHYA         3
1971  Arabidopsis thaliana  PHYA         2
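The tallying part on its own seems manageable. Here is a minimal sketch of what I have in mind, assuming I already have one four-digit publication year per line for a single gene. The helper name tally_by_year is made up for illustration, and the years fed to it below are placeholder input, not real query results.

```shell
# Hypothetical helper (not part of EDirect): reads publication years on stdin,
# one per line, and prints "YEAR Genus_Species Gene_Symbol Publication_Count" rows.
tally_by_year() {
  species="$1"   # e.g. "Arabidopsis thaliana"
  symbol="$2"    # e.g. PHYA
  # Keep only lines that are exactly a four-digit year, then count duplicates.
  grep -E '^[0-9]{4}$' \
    | sort \
    | uniq -c \
    | awk -v sp="$species" -v sym="$symbol" '{ print $2, sp, sym, $1 }'
}

# Placeholder years standing in for whatever the date-extraction step would emit.
printf '1970\n1971\n1970\n1970\n1971\n' | tally_by_year "Arabidopsis thaliana" PHYA
```

This only covers the counting for one gene; getting the years out of PubMed for every gene is the part I am stuck on.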
I have gotten this far. For an example gene ID (816394) in the taxon Arabidopsis thaliana (txid3702), I can get the count of all the PubMed articles linked to this gene:
esearch -db gene -query "txid3702[Organism:exp] AND 816394[UID]" | elink -target pubmed
The next step is to download the articles in XML or docsum format and filter them by publication date [PDAT]. That is the strategy I am using. I tried the following command, but it failed with the error "Too many requests":
esearch -db gene -query "txid3702[Organism:exp] AND 816394[UID]" | elink -target pubmed | efetch -format xml | xtract -pattern PubmedArticle -element PubDate
I don't know how to get around the "Too many requests" error. Thanks for the help.