Question: Running the BLAST tool using Python's multiprocessing package
m_asi (Portugal) wrote:

I am trying to run online NCBI BLAST in parallel using Python's multiprocessing package. While running the code, the following error occurred:

Process Process-4:
Traceback (most recent call last):
  File "C:\Users\muh_asif\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Users\muh_asif\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\muh_asif\PycharmProjects\parellel\index.py", line 16, in f
    result_handle = NCBIWWW.qblast("blastn", "nt", record.format("fasta"),entrez_query=j, hitlist_size=1)
  File "C:\Users\muh_asif\PycharmProjects\parellel\venv\lib\site-packages\Bio\Blast\NCBIWWW.py", line 141, in qblast
    rid, rtoe = _parse_qblast_ref_page(handle)
  File "C:\Users\muh_asif\PycharmProjects\parellel\venv\lib\site-packages\Bio\Blast\NCBIWWW.py", line 253, in _parse_qblast_ref_page
    raise ValueError("Error message from NCBI: %s" % msg)
ValueError: Error message from NCBI: Cannot accept request, error code: -1

This error occurred for many processes, for example for processes 5 and 6 as well.

Apparently NCBI did not accept the requests from some of the processes. Is there a way to fix this error? Am I allowed to submit 3 or 4 queries to NCBI at the same time?

Secondly, processes are only created for the first element of the taxa_id_list list, not for the second. Is there a better way to run BLAST in parallel using the multiprocessing package? I am new to multiprocessing and I am trying to make BLAST run faster. Here is the link for the input file (input_file), and the code is:

from multiprocessing import current_process
from Bio.Blast import NCBIXML
from Bio.Blast import NCBIWWW
from Bio import SeqIO

def f(record, j, id):
    record = str(record)
    print(record)
    j = str(j)
    print(j)
    proc_name = current_process().name
    print(f"Process name: {proc_name}")

    result_handle = NCBIWWW.qblast("blastn", "nt", record.format("fasta"),entrez_query=j, hitlist_size=1)
    blast_records = NCBIXML.parse(result_handle)

    for blast_record in blast_records:
        for alignment in blast_record.alignments:
            print(f"accession num: {alignment.accession} for ID: {id}")



if __name__ == '__main__':

    from multiprocessing import Process

    fasta_file_name = 'dummy_fasta.fasta'  
    my_fasta = SeqIO.parse(fasta_file_name, "fasta")
    # restrict BLAST to a specific species
    taxa_id_list = ["txid9606 [ORGN]", "txid39442 [ORGN]"]

    processes = []
    for j in taxa_id_list:
        for k in my_fasta: # read all sequences from fasta file
            seq = k.seq
            id = k.id
            process = Process(target=f, args=(seq, j, id))
            processes.append(process)
            process.start()
    for l in processes:
        l.join()

Thank you.

modified 18 months ago • written 18 months ago by m_asi
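
On the second part of the question (processes being created only for the first entry of taxa_id_list): SeqIO.parse returns an iterator, and an iterator is exhausted after the first pass of the outer loop, so the inner loop has nothing left to yield for the second taxon. Below is a minimal sketch of the effect and one possible fix. It uses a plain generator as a stand-in for SeqIO.parse so that it runs without Biopython; the fake_parse helper and the record names are hypothetical.

```python
from itertools import product


def fake_parse():
    """Hypothetical stand-in for SeqIO.parse: yields records once, then is empty."""
    yield from ["seq1", "seq2"]


taxa_id_list = ["txid9606 [ORGN]", "txid39442 [ORGN]"]

# Pattern from the question: the same iterator is reused for every taxon,
# so it is already consumed when the second taxon is reached.
records = fake_parse()
pairs_broken = [(taxon, rec) for taxon in taxa_id_list for rec in records]

# One possible fix: materialise the records in a list before looping.
records = list(fake_parse())
pairs_fixed = list(product(taxa_id_list, records))

print(len(pairs_broken))  # 2: only the first taxon got any records
print(len(pairs_fixed))   # 4: every taxon is paired with every record
```

With Biopython the equivalent change would be my_fasta = list(SeqIO.parse(fasta_file_name, "fasta")). This keeps all records in memory, which should be fine for a small FASTA file.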

NCBI is very likely to rate-limit you; try sending at most 2 or 3 requests at once.

written 18 months ago by Devon Ryan
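
One way to cap the number of simultaneous requests, instead of starting one Process per sequence, is a fixed-size worker pool. The sketch below is only an illustration: it uses ThreadPool from multiprocessing.pool (threads rather than processes, since a remote search mostly waits on the network), and blast_one is a hypothetical placeholder that sleeps instead of contacting NCBI. The cap of 2 and the job tuples are assumptions.

```python
import time
from multiprocessing.pool import ThreadPool

MAX_CONCURRENT = 2  # assumption: stay at the low end of "2 or 3 at most"


def blast_one(job):
    """Hypothetical placeholder for one remote search; does not contact NCBI."""
    seq_id, sequence, taxon = job
    # In real use, the call from the question would go here, e.g.
    # NCBIWWW.qblast("blastn", "nt", sequence, entrez_query=taxon, hitlist_size=1)
    time.sleep(0.05)  # stands in for waiting on the server
    return seq_id, taxon


def run_all(jobs, workers=MAX_CONCURRENT):
    """Run the jobs with at most `workers` in flight; results keep the input order."""
    with ThreadPool(processes=workers) as pool:
        return pool.map(blast_one, jobs)


# Made-up jobs: (sequence id, sequence, Entrez query)
example_jobs = [
    ("id1", "ACGT", "txid9606 [ORGN]"),
    ("id2", "TTGA", "txid9606 [ORGN]"),
    ("id1", "ACGT", "txid39442 [ORGN]"),
    ("id2", "TTGA", "txid39442 [ORGN]"),
]

if __name__ == "__main__":
    for seq_id, taxon in run_all(example_jobs):
        print(f"finished {seq_id} against {taxon}")
```

multiprocessing.Pool offers the same map interface, so swapping it in is possible if separate processes are really wanted; on Windows the pool must then be created under the if __name__ == "__main__" guard.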

BLAST itself is multi-threaded, so each job you start can use more than one thread. I am not sure why you want to use multiprocessing to submit remote BLAST jobs. Please be considerate of this public resource.

The NCBI WWW BLAST server is a shared resource, and it would be unfair for a few users to monopolize it. To prevent this, the server gives priority to interactive users who run a moderate number of searches. The server also keeps track of how many queries are in the queue for each user as well as how many searches a user has performed recently and prioritizes searches accordingly.

written 18 months ago by GenoMax
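
If speed is the goal, the multi-threading mentioned above can be used directly with a local BLAST+ installation rather than the remote server. The sketch below only builds a blastn command line with the -num_threads option and runs it on request. It assumes BLAST+ is installed and that a local nucleotide database exists; the database name my_local_db, the output file name, and both helper functions are hypothetical.

```python
import shutil
import subprocess


def build_blastn_command(query_fasta, database, out_path, threads=4):
    """Build a blastn command that lets one search use several threads."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,         # assumption: a local database with this name exists
        "-out", out_path,
        "-outfmt", "6",          # tabular output
        "-num_threads", str(threads),
    ]


def run_blastn(cmd):
    """Run the command if blastn is on PATH; raise otherwise."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("blastn not found on PATH")
    subprocess.run(cmd, check=True)


cmd = build_blastn_command("dummy_fasta.fasta", "my_local_db", "hits.tsv")
print(" ".join(cmd))
```

Calling run_blastn(cmd) would start the search. A single multi-sequence FASTA file can be passed as the query, so no Python-side parallelism is needed in this setup.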

Devon Ryan and genomax, thank you for your replies. I was not aware of the NCBI restrictions. From now on, I will submit at most one or two requests to BLAST at a time.

written 18 months ago by m_asi
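
When requests are submitted one or two at a time, an occasional "Cannot accept request" error could also be handled by waiting and retrying rather than letting the process die. The traceback in the question shows the error surfacing as a ValueError, so the sketch below retries on that exception. The call_with_retry helper, the number of attempts, and the 10-second base pause are all assumptions (check NCBI's usage guidelines for appropriate values), and flaky_search is a made-up function that only imitates the failure.

```python
import time


def call_with_retry(func, *args, retries=3, pause=10.0, sleep=time.sleep, **kwargs):
    """Call func; on ValueError wait and try again, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except ValueError:
            if attempt == retries:
                raise  # give up after the last attempt
            sleep(pause * attempt)  # wait a little longer after each failure


# Made-up search that fails twice and then succeeds (no network access).
calls = {"n": 0}


def flaky_search(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("Error message from NCBI: Cannot accept request, error code: -1")
    return f"result for {query}"


waits = []  # record the pauses instead of actually sleeping
result = call_with_retry(flaky_search, "ACGT", pause=10.0, sleep=waits.append)
print(result)  # result for ACGT
print(waits)   # [10.0, 20.0]
```

In real use the wrapped function would be NCBIWWW.qblast with the arguments from the question and the default sleep. Note that catching ValueError is broad and would also retry on genuinely malformed requests.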