I'm trying to BLAST around 10,000 protein sequences against nr with blastp. In the past, splitting the queries into 100-sequence chunks and running each chunk on a single CPU worked well for blastn, but blastp seems to be much slower: a .fasta file with 100 sequences, running on a single core, has not produced any output after 55 minutes.
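For context, chunking of this kind can be sketched roughly as follows. This is only an illustration: the function name `split_fasta` and the `PREFIX_0001.fasta` naming scheme are assumptions made for the example, not the exact script used.

```shell
# Split a multi-FASTA file into chunks of N sequences each.
# Hypothetical helper; usage: split_fasta INPUT.fasta N PREFIX
# Produces PREFIX_0001.fasta, PREFIX_0002.fasta, ...
split_fasta() {
    local input="$1" n="$2" prefix="$3"
    awk -v n="$n" -v prefix="$prefix" '
        /^>/ {
            # Every n-th header line starts a new chunk file.
            if (count % n == 0) {
                if (out) close(out)
                out = sprintf("%s_%04d.fasta", prefix, count / n + 1)
            }
            count++
        }
        # Write header and sequence lines to the current chunk;
        # anything before the first header is skipped.
        out { print > out }
    ' "$input"
}
```

Calling `split_fasta proteins.fasta 100 chunk` (with `proteins.fasta` as a placeholder name) would give about 100 chunk files for 10,000 sequences.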
I have BLAST+ installed in an HPC environment, with the datasets downloaded and indexed appropriately. I also tried blasting a single sequence using 16 cores:
blastp -query sequence.fasta -db nr -out test -outfmt 7 -num_threads 16
and it took around 10 minutes. The same sequence takes about a minute to process on the BLAST web server. I know it should go faster (per sequence) if I blast multiple sequences at once. Is there a way to work out the optimum ratio of number of sequences to number of cores, other than by trial and error? I have access to 1000 CPUs at once, so it would be nice to find a decent balance.
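If it does come down to trial and error, a small timing grid along these lines would be a starting point. This is only a sketch under assumptions: `benchmark_blastp` is a hypothetical helper, the `bench_t*.out` file names are made up, and the only blastp options used are the ones from the command above.

```shell
# Time blastp on one query file at several thread counts.
# Hypothetical helper; usage: benchmark_blastp QUERY.fasta DB THREADS...
# Prints one tab-separated line per run: query, threads, elapsed seconds.
benchmark_blastp() {
    local query="$1" db="$2"
    shift 2
    local threads start end
    for threads in "$@"; do
        start=$(date +%s)
        blastp -query "$query" -db "$db" \
               -out "bench_t${threads}.out" -outfmt 7 \
               -num_threads "$threads"
        end=$(date +%s)
        printf '%s\t%s\t%s\n' "$query" "$threads" "$((end - start))"
    done
}
```

For example, `benchmark_blastp chunk_0001.fasta nr 1 4 16` would time the same chunk at 1, 4 and 16 threads; repeating this for a few chunk sizes gives a rough sequences-per-core-hour figure for each combination. Repeated runs on the same node may benefit from the database already being cached in memory, so the first timing is not directly comparable with later ones.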
Also, why is the web server so much faster? Does it bundle multiple queries together, or something similar? Or could our local BLAST setup be suffering from disk I/O issues?