If I want to BLAST a large number of protein sequences against the NCBI nr database (for example, to analyse the species and functional composition with MEGAN), which of these options would be more sensible:
A.) to split the queries into subsets and run more jobs in parallel (using fewer threads each)
B.) to BLAST all queries in one job but using more threads
or doesn't it matter at all which of the two I choose?
I was under the impression that simply using twice as many threads should have almost exactly the same effect as splitting the query data into two subsets and running them in parallel. Is this assumption wrong?
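
To make option A concrete, here is a minimal sketch of what I mean by splitting, assuming the queries are in a single FASTA file. The function names `split_fasta` and `build_commands`, the chunk file names, and the output names are all hypothetical; the `blastp` flags shown (`-query`, `-db`, `-num_threads`, `-out`) are the standard BLAST+ ones. Records are dealt out round-robin, which balances the number of sequences per chunk but not necessarily their total length.

```python
def split_fasta(lines, n_chunks):
    """Distribute FASTA records round-robin over n_chunks lists of lines."""
    chunks = [[] for _ in range(n_chunks)]
    current = -1  # index of the chunk receiving the current record
    for line in lines:
        if line.startswith(">"):
            # A header line starts a new record: move on to the next chunk.
            current = (current + 1) % n_chunks
        if current >= 0:  # ignore anything before the first header
            chunks[current].append(line)
    return chunks


def build_commands(n_chunks, total_threads, db="nr"):
    """Build one blastp command per chunk, sharing the threads evenly.

    The chunk and output file names are hypothetical placeholders.
    """
    threads = max(1, total_threads // n_chunks)
    return [
        ["blastp", "-query", f"chunk_{i}.fasta", "-db", db,
         "-num_threads", str(threads), "-out", f"chunk_{i}.blast.tsv"]
        for i in range(n_chunks)
    ]


# Tiny made-up input: three protein records, split into two subsets.
records = [">seq1\n", "MKV\n", ">seq2\n", "MAL\n", "GGH\n", ">seq3\n", "MTT\n"]
parts = split_fasta(records, 2)

# Option A with 8 threads in total: two jobs with 4 threads each.
commands = build_commands(n_chunks=2, total_threads=8)
```

Option B would then simply be a single `blastp` call on the unsplit file with `-num_threads 8`.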