I'm trying to find the number of authors publishing on a given topic per year via an Entrez Direct query to PubMed. That is, I want to give it a query and get back the number of unique author names on publications for each year, preferably as an XLS or CSV spreadsheet. Here's what I have so far:
esearch -db pubmed -query "[query]" | efetch -format xml | xtract -pattern PubmedArticle -block Author -sep " " -element LastName,Initials -block PubDate -sep " " -element Year | sort-uniq-count > filename.xls
Unfortunately, that just gives me one row per article: a count of 1, followed by that article's full author list and the year. For one of my queries, the output looks like this:
1 Bondar SA Feklissowa ME Beloussowa ND 1965
1 BONDAR ZA FEKLISOVA ME BELOUSOVA ND 1965
1 DISANTAGNESE PA 1965
1 Georgi M Winkel K zum Prpic B 1965
1 HOLT PR HASHIM SA VANITALLIE TB 1965
1 KINNEY VR TAUXE WN DEARING WH 1965
1 KUO PT BASSETT DR DIGEORGE AM CARPENTER GG 1965
1 MALDONADO JE HANLON DG 1965
1 STICKLER GB PEYLA TL DOWER JC LOGAN GB 1965
1 Zujović J Milosević V Petrović L 1965
I've also tried moving the year to the first column, and that didn't help, but at least it was a bit neater.
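To make the goal concrete, here is a minimal sketch of the counting step I think I need. It assumes the xtract output could somehow be reshaped to one tab-separated year/author pair per line, which is exactly the part I haven't worked out. The function name count_authors_per_year and the sample rows are made up for illustration, and the sketch uses only standard sort, cut, uniq, and awk rather than anything from Entrez Direct.

```shell
# Hypothetical helper: reads "year<TAB>author" lines on stdin and
# prints "year,count" with each distinct author counted once per year.
# Assumption: the input already has one author per line.
count_authors_per_year() {
  sort -u |                     # drop repeated year/author pairs
  cut -f1 |                     # keep only the year column
  sort | uniq -c |              # count remaining rows per year
  awk '{ print $2 "," $1 }'     # reorder to CSV: year,count
}

# Made-up sample input: BONDAR ZA appears on two 1965 articles.
printf '1965\tBONDAR ZA\n1965\tFEKLISOVA ME\n1965\tBONDAR ZA\n1966\tHOLT PR\n' |
  count_authors_per_year
```

Note that this compares names as exact strings, so spelling or capitalization variants such as "Bondar SA" and "BONDAR ZA" would be counted as different authors.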
Does anyone know how I can get the count of unique authors for each year?
Thank you in advance.