Question: Batching PSIBLAST calls and obtaining individual PSSMs
wjn0 wrote, 10 months ago:

I'm currently running PSIBLAST on a single query sequence to produce a PSSM, which is then used as input to a number of ML tools (PSIPRED, DISOPRED, etc.). I'd like to make this more efficient so that I can process about 500k sequences, but I need an individual PSSM for each query sequence.

I know that BLAST is more efficient when given multiple query sequences at once, but the psiblast CLI program writes only a single PSSM when run this way, rather than one per query. Is there a way around this?

Thanks for reading!

Tags: blast, psiblast, pssm
Mensur Dlakic (USA) wrote, 10 months ago:

There is no way around this: you have to submit the sequences individually. Once you have split them into individual queries, you can run several psiblast instances in parallel, one per sequence. Note that this will need a lot of memory unless your database is small(ish). I suggest running a single psiblast query first and measuring its peak memory usage before attempting multiple searches simultaneously. Even with a large amount of RAM (512+ GB), I would still run at most 4-5 searches at a time, as you will otherwise run into I/O problems.
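As a rough illustration of the approach above, here is a minimal Python sketch that splits a multi-FASTA file into single-sequence files and runs one psiblast process per file, with a cap on how many run at once. It is only a sketch under assumptions: the function names, the `query_000000.fasta` naming scheme, the default of 3 iterations, and the choice to write the PSSM next to each query file are all made up for this example. The psiblast options used (`-query`, `-db`, `-num_iterations`, `-out_ascii_pssm`, `-out`) are standard BLAST+ options, but check them against your installed version.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_single_queries(records, out_dir):
    """Write each record to its own FASTA file and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (header, seq) in enumerate(records):
        # Hypothetical naming scheme: query_000000.fasta, query_000001.fasta, ...
        path = out_dir / f"query_{i:06d}.fasta"
        path.write_text(f">{header}\n{seq}\n")
        paths.append(path)
    return paths


def psiblast_cmd(query, db, iterations=3):
    """Build the psiblast command for one query; outputs land next to it."""
    query = Path(query)
    return [
        "psiblast",
        "-query", str(query),
        "-db", str(db),
        "-num_iterations", str(iterations),
        "-out_ascii_pssm", str(query.with_suffix(".pssm")),
        "-out", str(query.with_suffix(".out")),
    ]


def run_batch(query_paths, db, max_workers=4):
    """Run one psiblast process per query, at most max_workers at a time."""
    def run_one(query):
        # check=False: collect the return code instead of raising on failure.
        return subprocess.run(psiblast_cmd(query, db), check=False).returncode

    # The threads only wait on child processes, so a thread pool is sufficient.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, query_paths))
```

Typical use would be to call `write_single_queries(read_fasta(Path("all.fasta").read_text()), "queries")` and pass the resulting paths to `run_batch` together with your database name (the file and directory names here are placeholders). The default `max_workers=4` follows the 4-5 simultaneous searches suggested above. For the single-query memory check, one option on Linux is GNU `/usr/bin/time -v`, which reports the maximum resident set size of the command it runs.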
