I want to make a complex query, or a large number of queries, to the PubMed API.
My problem is that I have a lot of gene symbols (~20,000) and a term as input (for example, inflammation). I want to search through all PubMed titles and abstracts and get a list of the genes that most often occur together with the term. So my naive algorithm is to make 20,000 queries like
gene1 AND term, gene2 AND term, ..., gene20000 AND term
and sort the genes by the number of results for each query. But of course I can't make that many queries (there is a limit on the number of queries per second).
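To make the naive approach concrete, here is a minimal sketch of it in Python. It assumes the NCBI E-utilities `esearch` endpoint with `rettype=count` and a JSON response containing `esearchresult.count`, and it uses the `[tiab]` field tag to restrict matches to titles and abstracts. The function names (`build_query`, `pubmed_count`, `rank_genes`) and the 0.34 s delay (about 3 requests per second, the limit I understand applies without an API key) are my own choices for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint: NCBI E-utilities esearch.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(gene, term):
    """Build 'gene AND term', restricted to title/abstract via [tiab]."""
    return f"{gene}[tiab] AND {term}[tiab]"


def pubmed_count(query):
    """Ask esearch only for the hit count of a query (needs network access)."""
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": query, "rettype": "count", "retmode": "json"}
    )
    with urllib.request.urlopen(f"{ESEARCH_URL}?{params}") as resp:
        data = json.load(resp)
    # Assumed JSON layout: {"esearchresult": {"count": "123", ...}}
    return int(data["esearchresult"]["count"])


def rank_genes(genes, term, count_fn=pubmed_count, delay=0.34):
    """Run one query per gene and sort genes by hit count, highest first."""
    counts = {}
    for gene in genes:
        counts[gene] = count_fn(build_query(gene, term))
        time.sleep(delay)  # stay under the per-second request limit
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

With this pacing, 20,000 genes would take roughly two hours (20,000 × 0.34 s), which is exactly the cost I would like to avoid. Multi-word terms would also need quoting inside `build_query`.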
Another way is to make a single query for the term, download all the results, and then search them locally. But there could be a lot of results, and in that case the download may take hours.
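The local-search step of this second approach could look roughly like the sketch below. It assumes the titles and abstracts have already been downloaded as plain strings, and it counts each gene at most once per record, so the result is the number of records mentioning the gene. The function name `count_gene_mentions` and the tokenizing regex are hypothetical; matching is exact and case-sensitive, so aliases and symbols that are also ordinary words are not handled.

```python
import re
from collections import Counter

# Tokens are runs of letters, digits and hyphens, so "IL-6" stays one token.
TOKEN_RE = re.compile(r"[A-Za-z0-9-]+")


def count_gene_mentions(texts, gene_symbols):
    """Count, for each gene symbol, how many texts mention it at least once."""
    genes = set(gene_symbols)
    counts = Counter()
    for text in texts:
        tokens = set(TOKEN_RE.findall(text))  # unique tokens in this record
        counts.update(tokens & genes)         # one hit per gene per record
    return counts


if __name__ == "__main__":
    records = [
        "TNF and IL6 drive inflammation.",
        "IL6 levels rose; IL6 was measured again.",
        "No gene symbols here.",
    ]
    # Print genes from most to least frequently mentioned.
    for gene, n in count_gene_mentions(records, ["TNF", "IL6", "TP53"]).most_common():
        print(gene, n)
```

Because the gene symbols are kept in a set, each record costs one tokenization and one set intersection, regardless of whether there are 20 or 20,000 symbols. The slow part remains the download itself.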
Do you know any way to make such queries relatively fast?
Thank you! Didn't know about that database before.
It looks like this database isn't complete, though: at least two random PMIDs weren't found there.
These are the PMIDs declared in the 'gene' database.