Does MyGene have request limitations? - 400 Client Error for POST method
3.5 years ago
brendan • 0

We have an RNA-seq analysis pipeline hosted on AWS that runs hundreds of RNA-seq samples through alignment and gene counting. At the end of the pipeline we use MyGene to look up gene symbols for all the genes that have counts. This worked well during testing, but once it was deployed to production I started multiple runs at the same time, each containing hundreds of samples. This likely totaled to millions of gene_ids being queried through mygene, resulting in the following error:

    gene_symbol_queries = mg.querymany(stats_df["Geneid"], "ensembl.gene", fields="symbol", returnall=False, as_dataframe=True)
  File "/usr/local/lib/python3.6/site-packages/biothings_client/base.py", line 542, in _querymany
    for hits in self._repeated_query(query_fn, qterms, verbose=verbose):
  File "/usr/local/lib/python3.6/site-packages/biothings_client/base.py", line 223, in _repeated_query
    from_cache, query_result = query_fn(batch, *fn_kwargs)
  File "/usr/local/lib/python3.6/site-packages/biothings_client/base.py", line 541, in query_fn
    def query_fn(qterms): return self._querymany_inner(qterms, verbose=verbose, *kwargs)
  File "/usr/local/lib/python3.6/site-packages/biothings_client/base.py", line 488, in _querymany_inner
    return self._post(_url, params=_kwargs, verbose=verbose)
  File "/usr/local/lib/python3.6/site-packages/biothings_client/base.py", line 176, in _post
    res.raise_for_status()
  File "/usr/local/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: search_phase_execution_exception for url: http://mygene.info/v3/query/

This error is raised for about 35% of all our jobs (1,108 succeeded, 605 failed). Is the high traffic causing this error? Do you have any advice on how to get around this issue?

mygene annotation • 801 views
3.5 years ago
GenoMax 141k

This likely totaled to millions of gene_ids being queried through mygene

People provide these services for casual use, not for production sequencing/analysis. You should email the mygene project owners to find a solution. The software appears to be open source, but it is not clear whether you can easily install a local copy with all the needed data. That would be the ideal solution considering the volume of your data.
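
Until a local copy or an arrangement with the project owners is in place, one stopgap is to cut the request volume on the client side: query each distinct gene ID only once, reuse the result across samples, and back off before retrying a failed call. The sketch below is only an illustration under assumptions. The helper names (fetch_with_backoff, lookup_symbols) and the retry settings are made up for this example, the query function is passed in rather than hard-wired to mygene, and it is not confirmed that the 400 search_phase_execution_exception above is caused by load, so retrying may not help in every case.

```python
import time


def fetch_with_backoff(query_fn, ids, max_retries=4, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call query_fn(ids), waiting base_delay * 2**attempt between failures.

    Hypothetical helper: in real use, narrow retry_on to something like
    requests.exceptions.HTTPError instead of catching every exception.
    """
    for attempt in range(max_retries):
        try:
            return query_fn(ids)
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the original error
            sleep(base_delay * 2 ** attempt)


def lookup_symbols(ids, cache, query_fn, **backoff_kwargs):
    """Return {gene_id: symbol}, querying only IDs not already in cache.

    cache is a plain dict shared between calls; query_fn must return a
    mapping of gene ID to symbol for the IDs it is given.
    """
    missing = sorted(set(ids).difference(cache))
    if missing:
        cache.update(fetch_with_backoff(query_fn, missing, **backoff_kwargs))
    # IDs the service did not return map to None.
    return {gene_id: cache.get(gene_id) for gene_id in ids}


# Possible wiring to mygene (not executed here; needs the mygene package):
#
#   import mygene
#   mg = mygene.MyGeneInfo()
#
#   def mygene_query(ids):
#       hits = mg.querymany(ids, scopes="ensembl.gene", fields="symbol",
#                           returnall=False)
#       return {h["query"]: h.get("symbol") for h in hits}
#
#   symbols = lookup_symbols(stats_df["Geneid"], cache, mygene_query)
```

Since every sample is counted against the same annotation, the set of distinct Ensembl IDs is probably far smaller than the total number of IDs submitted, so building the mapping once per run (or saving it to a file that all jobs read) may remove most of the traffic.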


Sounds good, thanks!
