Hi All. I'm trying to format the results of a PubMed search using Entrez Direct (the E-utilities on the UNIX command line). What I want to do is search with esearch and then use xtract to output each record in the following format:
pubmedid    First Author Last Name / First Initial    First Author Affiliation    Date Published
I can get parts of the output I need using two different commands, but combining them looks trickier than it should be, or so I imagine. The problem, I think, lies with the two different formats: docsum and xml.
The first command I've been playing with:
./esearch -db pubmed -query "search string" | ./efilter -mindate 2005 | ./efetch -format docsum | ./xtract -pattern DocumentSummary -element MedlineCitation/PMID -element Id SortFirstAuthor | sort -t $'\t' -k 3,3n -k 2,2f
So that outputs the first two columns as desired. However, the docsum format doesn't contain information about affiliation.
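For reference, here is a small sketch of what the sort stage at the end of the pipeline does, run on made-up rows rather than real PubMed output. The three-column layout (ID, author, year) and the sample values are assumptions for illustration only. The tab is built with printf so the snippet also works in shells that lack the $'\t' syntax.

```shell
#!/bin/sh
# Hypothetical sample rows: ID <tab> author <tab> year (not real PubMed data).
TAB=$(printf '\t')

rows=$(printf '222\tSmith J\t2010\n111\tadams K\t2007\n333\tBrown L\t2007\n')

# Same keys as in the pipeline above: column 3 numerically,
# then column 2 ignoring case.
sorted=$(printf '%s\n' "$rows" | sort -t "$TAB" -k 3,3n -k 2,2f)

printf '%s\n' "$sorted"
```

If the extracted table has fewer than three columns, the first key has nothing to compare and only the second key has any effect.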
Using this command:
./esearch -db pubmed -query "search string" | ./efilter -mindate 2005 | ./efetch -format xml | ./xtract -pattern PubmedArticle -element MedlineCitation/PMID -element Id SortFirstAuthor Affiliation -block PubDate -sep " " -element Year,Month MedlineDate | sort -t $'\t' -k 3,3n -k 2,2f
With this I get the pubmedid and the affiliations of every author on each publication, rather than just the first author's.
Does anyone know how I might tweak either of these commands? Is it possible to ignore all other authors except the first author?
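One possible direction, offered as an untested sketch: restrict the author block to its first occurrence. This assumes that xtract accepts a -position first argument after -block, which is worth confirming with xtract -help for the installed version. The function name first_author_rows is hypothetical, and the sketch assumes the EDirect tools are on PATH rather than called as ./esearch.

```shell
#!/bin/sh
# Hypothetical helper (the name is not part of EDirect).
# Assumption: "-block Author -position first" limits the block to the first
# Author element of each PubmedArticle; check against xtract -help.
# Expected columns: PMID, first author's last name and initials,
# that author's affiliation, then the publication date.
first_author_rows() {
  esearch -db pubmed -query "$1" |
    efilter -mindate 2005 |
    efetch -format xml |
    xtract -pattern PubmedArticle -element MedlineCitation/PMID \
      -block Author -position first -sep " " -element LastName,Initials Affiliation \
      -block PubDate -sep " " -element Year,Month MedlineDate
}
```

It would be called as first_author_rows "search string", with the output piped to sort as before. Because everything comes from the xml format, the docsum command would no longer be needed if this works.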
All help is appreciated.