My problem is the following: I have a list of GI identifiers from the NCBI nucleotide database. For instance, take just this one: `76365841`. I want to extract the "isolation source" term from its record. The answer here is "Everglades wetlands", which you can see by retrieving the record with `efetch`.
However, when I hit a full chromosome with a huge sequence, my program downloads the entire sequence and Biopython's `Entrez.parse` is unable to handle it. For instance, with: `332640072`
Is there any way of building a request to NCBI to batch download the sequence information (including the isolation source) WITHOUT downloading the actual sequence, i.e. the A, G, T and C bases?
If you want to see the program:

    #python
    from Bio import Entrez
    gis = ['332640072', '76365841', '22506766', '389043336']
    response = Entrez.efetch(db="nucleotide", id=gis, retmode="xml")
    records = list(Entrez.parse(response, validate=True))
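
For completeness, this is roughly how I pull the term out of each parsed record afterwards. It is only a sketch: the helper name `isolation_source` is made up for this post, and the `example` dictionary is a hand-written stand-in that mimics the GBSeq layout of one parsed record (it is not real fetched data), so the snippet runs without touching NCBI.

```python
def isolation_source(record):
    """Return the first isolation_source qualifier of a GBSeq record, or None."""
    # Each parsed record keeps its features under "GBSeq_feature-table".
    for feature in record.get("GBSeq_feature-table", []):
        # The isolation source is a qualifier of the "source" feature.
        if feature.get("GBFeature_key") != "source":
            continue
        for qual in feature.get("GBFeature_quals", []):
            if qual.get("GBQualifier_name") == "isolation_source":
                return qual.get("GBQualifier_value")
    return None


# Hand-written stand-in for one parsed record (assumption: not fetched data).
example = {
    "GBSeq_feature-table": [
        {
            "GBFeature_key": "source",
            "GBFeature_quals": [
                {"GBQualifier_name": "organism",
                 "GBQualifier_value": "uncultured bacterium"},
                {"GBQualifier_name": "isolation_source",
                 "GBQualifier_value": "Everglades wetlands"},
            ],
        }
    ]
}

print(isolation_source(example))  # prints: Everglades wetlands
```

With the real data this would be called as `[isolation_source(r) for r in records]`, which is exactly the step that never gets reached when the full chromosome sequence comes along in the response.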