How to do multiple blast for multifasta file/s with command line blast
8.0 years ago
tcf.hcdg ▴ 70

Hello, I have a dataset of around 30,000 query sequences and would like to run BLAST searches for this dataset against 4 different databases.

I know how to run BLAST from the command line, and it works fine for an individual case.

I wonder if there is any way to run multiple BLAST searches automatically from the command line, where the databases and BLAST parameters are different for each case.

blast • 5.9k views

While setting these up would be relatively straightforward with a for loop, the question is whether you are planning to submit the jobs via a scheduler to a cluster or run them locally in serial fashion. That is a big input dataset. Do you need to submit 30,000 separate queries, or are you planning to chunk them as sets of multi-fasta files?


I would like to run them locally on my computer. The sequences are in multi-fasta files, and I need to run BLAST for each file separately, with different parameters and a different database.


"where databases and blast parameters are different for each case."

What is your "model"? Do you have a file listing the fasta file(s) and the conditions? A TSV file? An XML file?


Actually, I have 4 different data files that contain 10k, 20k, 30k and 40k sequences respectively. I would like to search each of the files against four different databases (reference1, reference2, reference3, reference4).
There are 2 parameter settings I need to search with, i.e. % identity of 70% and 90%. In total I would like to run 4 x 4 x 2 = 32 BLAST searches.

I would like to do it locally on my computer with BLAST 2.2.31+. Is there any way to do it automatically, meaning I write the parameters and database for each search in a script and get the results in 32 files, one for each?
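
The 4 x 4 x 2 grid described above can be enumerated with three nested loops. The following is only a minimal sketch for Node.js (an assumption; the thread itself uses jrunscript). The query file names q1.fa to q4.fa and the function name buildCommands are hypothetical, and it assumes nucleotide queries and databases, so that blastn and its -perc_identity option apply. It prints one command per line and runs nothing itself.

```javascript
// Hypothetical names: replace with the real query files and database names.
const queries = ["q1.fa", "q2.fa", "q3.fa", "q4.fa"];
const databases = ["reference1", "reference2", "reference3", "reference4"];
const identities = [70, 90];

// Build one blastn command line per (query, database, identity) combination.
function buildCommands(queries, databases, identities) {
  const commands = [];
  for (const query of queries) {
    // Strip the file extension so it can be reused in the output name.
    const base = query.replace(/\.[^.]+$/, "");
    for (const db of databases) {
      for (const identity of identities) {
        // The output name encodes all three parameters,
        // e.g. q1_vs_reference1_id70.blast
        const out = `${base}_vs_${db}_id${identity}.blast`;
        commands.push(
          `blastn -query ${query} -db ${db} -perc_identity ${identity} -out ${out}`
        );
      }
    }
  }
  return commands;
}

const commands = buildCommands(queries, databases, identities);
console.log(commands.join("\n"));
```

Redirecting the output to a file with a .bat extension would give a batch file that Windows runs from top to bottom, one search at a time. Naming each result after its query, database and identity threshold keeps the 32 files distinguishable.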


I made the four databases locally on my computer with makeblastdb.

8.0 years ago

You could define your model as a JSON/JavaScript structure. Below is a program for jrunscript: https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jrunscript.html


var out=java.lang.System.out;
var model = {
"fasta":["f1.fa","f2.fa","f3.fa"],
"database": ["db1.fa", "db2.fa","db6.fa","db7.fa"],
"evalue":[1.0,0.7],
"identity":[70,80]
};

var i=0,j,k,m,targetid=0;
out.println("T=");
out.println("all: all_targets");

for(i=0;i< model.database.length;++i)
{
var name=  model.database[i];
out.println("$(addsuffix .nin,"+name+"): "+name);
out.println("\tmakeblastdb -dbtype nucl -in $<");
}



for(i in model.fasta)
    for(j in model.database)
        for(k in model.evalue)
            for(m in model.identity) {
                targetid++;
                out.println("T+= t"+targetid+".blast");
                out.println("t"+targetid+".blast : $(addsuffix .nin,"+model.database[j]+") "+model.fasta[i]);
                out.println("\tblastn -db "+model.database[j]+" -out $@ -evalue "+ model.evalue[k]+" -query "+model.fasta[i]+" -perc_identity "+model.identity[m]);

                }


out.println("all_targets: ${T}");

The program loops over the parameters and generates a Makefile. Here is the output of jrunscript input.js:

T=
all: all_targets
$(addsuffix .nin,db1.fa): db1.fa
    makeblastdb -dbtype nucl -in $<
$(addsuffix .nin,db2.fa): db2.fa
    makeblastdb -dbtype nucl -in $<
$(addsuffix .nin,db6.fa): db6.fa
    makeblastdb -dbtype nucl -in $<
$(addsuffix .nin,db7.fa): db7.fa
    makeblastdb -dbtype nucl -in $<
(...)
T+= t46.blast
t46.blast : $(addsuffix .nin,db7.fa) f3.fa
    blastn -db db7.fa -out $@ -evalue 1 -query f3.fa -perc_identity 80
T+= t47.blast
t47.blast : $(addsuffix .nin,db7.fa) f3.fa
    blastn -db db7.fa -out $@ -evalue 0.7 -query f3.fa -perc_identity 70
T+= t48.blast
t48.blast : $(addsuffix .nin,db7.fa) f3.fa
    blastn -db db7.fa -out $@ -evalue 0.7 -query f3.fa -perc_identity 80
all_targets: ${T}
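
Because the targets are numbered t1 to t48 in loop order, a result file name says nothing about the parameters behind it. As a small sketch (written for Node.js, which is an assumption; the function name describeTarget is hypothetical, and it relies on the same model and loop nesting as the script above), a target number can be mapped back to its combination:

```javascript
// Same model as in the generator script above.
const model = {
  fasta: ["f1.fa", "f2.fa", "f3.fa"],
  database: ["db1.fa", "db2.fa", "db6.fa", "db7.fa"],
  evalue: [1.0, 0.7],
  identity: [70, 80],
};

// Map a 1-based target number (as in t46.blast) back to its parameters.
// The innermost loop (identity) varies fastest, the outermost (fasta) slowest,
// so the number is decoded like a mixed-radix counter, innermost digit first.
function describeTarget(model, targetId) {
  let n = targetId - 1;
  const identity = model.identity[n % model.identity.length];
  n = Math.floor(n / model.identity.length);
  const evalue = model.evalue[n % model.evalue.length];
  n = Math.floor(n / model.evalue.length);
  const database = model.database[n % model.database.length];
  n = Math.floor(n / model.database.length);
  const fasta = model.fasta[n];
  return { fasta, database, evalue, identity };
}

console.log(describeTarget(model, 46));
```

For target 46 this yields f3.fa, db7.fa, evalue 1 and identity 80, which matches the t46.blast rule in the output above.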

You can pipe the output into GNU make and run several jobs in parallel (option -j of make):

(not tested)

$ jrunscript input.js | make -f - -j 10

makeblastdb -dbtype nucl -in db1.fa
blastn -db db1.fa -out t1.blast -evalue 1 -query f1.fa -perc_identity 70
blastn -db db1.fa -out t2.blast -evalue 1 -query f1.fa -perc_identity 80
blastn -db db1.fa -out t3.blast -evalue 0.7 -query f1.fa -perc_identity 70
blastn -db db1.fa -out t4.blast -evalue 0.7 -query f1.fa -perc_identity 80
makeblastdb -dbtype nucl -in db2.fa
blastn -db db2.fa -out t5.blast -evalue 1 -query f1.fa -perc_identity 70
blastn -db db2.fa -out t6.blast -evalue 1 -query f1.fa -perc_identity 80
blastn -db db2.fa -out t7.blast -evalue 0.7 -query f1.fa -perc_identity 70
blastn -db db2.fa -out t8.blast -evalue 0.7 -query f1.fa -perc_identity 80
makeblastdb -dbtype nucl -in db6.fa
blastn -db db6.fa -out t9.blast -evalue 1 -query f1.fa -perc_identity 70
blastn -db db6.fa -out t10.blast -evalue 1 -query f1.fa -perc_identity 80
blastn -db db6.fa -out t11.blast -evalue 0.7 -query f1.fa -perc_identity 70
blastn -db db6.fa -out t12.blast -evalue 0.7 -query f1.fa -perc_identity 80
makeblastdb -dbtype nucl -in db7.fa
blastn -db db7.fa -out t13.blast -evalue 1 -query f1.fa -perc_identity 70
blastn -db db7.fa -out t14.blast -evalue 1 -query f1.fa -perc_identity 80
blastn -db db7.fa -out t15.blast -evalue 0.7 -query f1.fa -perc_identity 70
blastn -db db7.fa -out t16.blast -evalue 0.7 -query f1.fa -perc_identity 80
blastn -db db1.fa -out t17.blast -evalue 1 -query f2.fa -perc_identity 70
blastn -db db1.fa -out t18.blast -evalue 1 -query f2.fa -perc_identity 80
blastn -db db1.fa -out t19.blast -evalue 0.7 -query f2.fa -perc_identity 70
blastn -db db1.fa -out t20.blast -evalue 0.7 -query f2.fa -perc_identity 80
blastn -db db2.fa -out t21.blast -evalue 1 -query f2.fa -perc_identity 70
blastn -db db2.fa -out t22.blast -evalue 1 -query f2.fa -perc_identity 80
blastn -db db2.fa -out t23.blast -evalue 0.7 -query f2.fa -perc_identity 70
blastn -db db2.fa -out t24.blast -evalue 0.7 -query f2.fa -perc_identity 80
blastn -db db6.fa -out t25.blast -evalue 1 -query f2.fa -perc_identity 70
blastn -db db6.fa -out t26.blast -evalue 1 -query f2.fa -perc_identity 80
blastn -db db6.fa -out t27.blast -evalue 0.7 -query f2.fa -perc_identity 70
blastn -db db6.fa -out t28.blast -evalue 0.7 -query f2.fa -perc_identity 80
(...)

@Pierre: The problem is whether the jobs submitted this way are going to play nice (i.e. wait until the first one completes). @tcf.hcdg wants to do this on a standalone computer.


On a standalone computer, don't use the '-j' option; without it, make runs the commands one at a time.
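
Running one job at a time can also be done without make. The following is a minimal sketch for Node.js (an assumption, as is the helper name runSerially): each command is started only after the previous one has exited, and the run stops at the first failure. The job list is copied from the make output above and is not executed by the sketch itself.

```javascript
const { spawnSync } = require("child_process");

// Run jobs strictly one after another and stop at the first failure.
// Each job is { cmd, args }. The `run` parameter is injectable so the
// logic can be tried without BLAST installed; by default it uses spawnSync,
// which blocks until the child process has exited.
function runSerially(
  jobs,
  run = (cmd, args) => spawnSync(cmd, args, { stdio: "inherit" })
) {
  let completed = 0;
  for (const { cmd, args } of jobs) {
    const result = run(cmd, args);
    if (result.error || result.status !== 0) {
      console.error(`failed: ${cmd} ${args.join(" ")}`);
      break;
    }
    completed++;
  }
  return completed; // number of jobs that finished successfully
}

// First two searches from the make output above.
const jobs = [
  { cmd: "blastn", args: ["-db", "db1.fa", "-out", "t1.blast", "-evalue", "1",
                          "-query", "f1.fa", "-perc_identity", "70"] },
  { cmd: "blastn", args: ["-db", "db1.fa", "-out", "t2.blast", "-evalue", "1",
                          "-query", "f1.fa", "-perc_identity", "80"] },
];

// Uncomment to actually run the searches (requires blastn on the PATH):
// runSerially(jobs);
```

Passing the arguments as an array avoids shell quoting issues, which differ between Windows and Unix shells.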


I tried to run the "js" file with the following options, but it gives the following error. I haven't worked with Java before, so I don't have even basic knowledge of it (apologies if I made a basic mistake in running the command).

C:\myprog\blast-2.2.31+>jrunscript -e -l js -f input.js
'jrunscript' is not recognized as an internal or external command,
operable program or batch file.

C:\myprog\blast-2.2.31+>js
'js' is not recognized as an internal or external command,
operable program or batch file.

I have BLAST installed locally on my Windows computer.


You will have to install the Java Development Kit (JDK) for Windows to use jrunscript, and its bin directory must be on your PATH for the command to be recognized.
