My problem is the following: I have a list of GI identifiers from the NCBI nucleotide database. For instance, take just this one: `76365841`. I want to extract the "isolation source" term from it. The answer here is "Everglades wetlands", which you can see by using `efetch`.
Not all sequences have that information, but here is an example with Entrez Direct:
epost -db nuccore -id 22506766,332640072,76365841,389043336 | efetch -format docsum | xtract -element SubName | tr "\t" "\n"
PB131|Panama: Panama Province, Las Cumbres Lake|lake water at 5 m depth during dry season|9.0986 N 79.5392 W
F124|USA: Florida|Everglades wetlands
SFD1-19|USA: San Francisco Delta, Mildred Island 2000-07-20
I'm not sure the SubName element is standard, though.
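Note that the position of the isolation source inside the pipe-delimited string is not fixed: in the output above it is the third field for two records and missing from the third. A rough sketch of one way around that, assuming the docsum also carries a matching pipe-delimited list of qualifier names (I believe this is the SubType element, but I have not checked it for these records). `pick_qualifier` is a made-up helper name, and the qualifier-name string in the example call is my guess, not real output:

```shell
# Hypothetical helper: look up one named qualifier in a pipe-delimited value string.
#   $1 = qualifier names, e.g. "clone|country|isolation_source" (assumed to come from SubType)
#   $2 = qualifier values in the same order (the SubName string)
#   $3 = the qualifier name to look up
# Prints the matching value, or nothing if the name is absent.
pick_qualifier() {
  awk -v names="$1" -v values="$2" -v want="$3" 'BEGIN {
    n = split(names, k, "[|]")      # split the names on a literal pipe
    split(values, v, "[|]")         # split the values the same way
    for (i = 1; i <= n; i++)
      if (k[i] == want) { print v[i]; exit }
  }'
}

# Example call; the names string is an assumption made for illustration.
pick_qualifier "clone|country|isolation_source" \
               "F124|USA: Florida|Everglades wetlands" \
               "isolation_source"
```

If the names really are available, each record's name and value strings could be fed to this function instead of relying on the third field always being the isolation source.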