Iterating through my list of search terms only returns results from the last term in the list
2.6 years ago
jwang • 0

Hi, I've been trying to loop through a list of search terms I'm interested in. When I run the first part of my code, I only get publications matching the last search term in my list. Instead, I want it to iterate through the list and append to the record_list object for each new search term, not overwrite it with the results from the last search. Thanks!

from Bio import Entrez
from Bio import Medline
from tqdm import tqdm
import pandas as pd
pd.set_option('display.max_colwidth', -1)
import numpy as np

disease_list = ['ebola', 'aml', 'primary glomerular disease associated with significant proteinuria']

def search(x):
Entrez.email=Entrez.email
results = {}
for x in disease_list:
keyword = x
handle = Entrez.esearch(db ='pubmed',
retmax=1000,
retmode ='text',
term = keyword)
print('Total number of publications that contain the term {}: {}'.format(keyword, results['Count']))
for keyword, results['Count'] in results:
results[x].append(results['Count'])
return results

if __name__ == '__main__':
results = search(disease_list)

Esearch Entrez.esearch Pubmed Abstracts • 687 views

If you're only getting the last entry out of a loop, it's because somewhere your loop is overwriting the entry instead of appending new ones.
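To illustrate the difference in isolation, here is a minimal sketch that has nothing to do with Entrez. The names `last_only` and `collected` are made up for this example, and `term.upper()` just stands in for whatever work is done per search term.

```python
terms = ["ebola", "aml"]

# Overwriting: the same name is rebound on every pass,
# so only the value from the final term survives the loop.
last_only = None
for term in terms:
    last_only = term.upper()

# Accumulating: each pass adds a new key to the dict,
# so the result for every term is kept.
collected = {}
for term in terms:
    collected[term] = term.upper()

print(last_only)   # AML
print(collected)   # {'ebola': 'EBOLA', 'aml': 'AML'}
```

If your real loop looks like the first pattern anywhere (a plain assignment to one variable inside the loop), that is the likely place where earlier results are being lost.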

I've tried to fix the formatting of your code, but at the moment your indentation and loop structures are not clear, so it's not obvious which statements you mean to have in which loop, and thus where your problem comes from. Please make sure the code appears correct.

Sorry about that, let's try again. I turned it into a function so it's easier to read. This is the part I'm getting stuck on:

from Bio import Entrez
from Bio import Medline
from tqdm import tqdm
import pandas as pd
pd.set_option('display.max_colwidth', -1)
import numpy as np

#############################################################################
disease_list = ['ebola', 'aml', 'primary glomerular disease associated with significant proteinuria']

def search(x):
    Entrez.email=Entrez.email
    results = {}
    for x in disease_list:
        keyword = x
        handle = Entrez.esearch(db ='pubmed',
                                retmax=1000,
                                retmode ='text',
                                term = keyword)
        print('Total number of publications that contain the term {}: {}'.format(keyword, results['Count']))
        for keyword, results['Count'] in results:
            results[x].append(results['Count'])
    return results

if __name__ == '__main__':
    results = search(disease_list)
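
One possible way to get a result per term is sketched below. It is a sketch under assumptions, not a confirmed fix for the exact script. In the posted code the `handle` from `Entrez.esearch` is never parsed, and `results['Count']` is looked up on a dict that is still empty. The sketch parses each reply with `Entrez.read` and stores it under its own key. The helper name `esearch_pubmed`, the `fetch` parameter, the shape of the returned dict, and the placeholder email address are all made up for illustration. `retmode='text'` is dropped because `Entrez.read` expects the XML that `esearch` returns by default.

```python
def esearch_pubmed(term, retmax=1000):
    """Run one PubMed esearch and return the parsed reply (needs network access)."""
    # Imported here so the rest of the sketch loads even without Biopython installed.
    from Bio import Entrez

    Entrez.email = "you@example.org"  # placeholder: NCBI asks for a real contact address
    handle = Entrez.esearch(db="pubmed", term=term, retmax=retmax)
    record = Entrez.read(handle)  # parse the XML reply into a dict-like object
    handle.close()
    return record


def search(terms, fetch=esearch_pubmed):
    """Return {term: {"count": int, "ids": [pmid, ...]}} with one entry per term."""
    results = {}
    for term in terms:
        record = fetch(term)
        # Store under the term itself, so each pass adds a key
        # instead of replacing the previous term's result.
        results[term] = {
            "count": int(record["Count"]),
            "ids": list(record["IdList"]),
        }
    # Return only after the loop has finished with every term.
    return results
```

With this in place, `results = search(disease_list)` should give one entry per search term, and `results['ebola']['count']` would hold the total hit count for that term. The `fetch` argument is only there so the loop can be tried with a stand-in function instead of a live NCBI query.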