Question: Blast in parallel with edible outfmt
12 days ago by
K.Gee40
K.Gee40 wrote:

Hello Biostars

I know that I can run blast with "edible" output by using this command:

blastp -query FASTA.faa -db MYDATABASE -evalue 1e-5 -outfmt "6 sseqid pident evalue bitscore staxids"

If I want to run the same command in parallel, I use:

ls *.fna| parallel -a - blastp -query {} -db MYDATABASE -evalue 1e-5 -outfmt 6 -out {.}.txt

My issue is that when I try to combine the custom outfmt with parallel, I get an error. Can these two commands be combined into one?

I tried:

ls *.fna| parallel -a - blastp -query {} -db MYDATABASE -evalue 1e-5 -outfmt "6 sseqid pident evalue bitscore staxids sscinames sskingdoms stitle" -out {.}.txt

This gives me:

Error: Too many positional arguments (1), the offending value: evalue
Error: (CArgException::eSynopsis) Too many positional arguments (1), the offending value: evalue

Sorry for my terminology, but I'm a new kid on the block.

Thank you in advance

output blast • 146 views
ADD COMMENTlink modified 12 days ago by ATpoint46k • written 12 days ago by K.Gee40

I suspect this is just a matter of how bash and parallel are handling the " differently. Have you tried hard-quoting instead with '?

ADD REPLYlink written 12 days ago by Joe19k

I did, but it doesn't work either :-(, but thanks for your response anyhow!!!

ADD REPLYlink modified 12 days ago • written 12 days ago by K.Gee40

Does the command line work when you use only -outfmt 6 (i.e. does the problem start when you add the "6 ...." part)?

On a side note: I don't see any immediate advantage in using parallel here (if you want to speed it up, run blast on multiple threads instead).
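
If both approaches are combined anyway, one way to avoid oversubscribing the machine is to divide the cores between the simultaneous jobs. This is only a sketch: `threads_per_job` is a hypothetical helper, the job count of 2 is an arbitrary assumption, and the resulting numbers would be passed to parallel's `-j` and blastp's `-num_threads` options.

```shell
# Hypothetical helper: split a number of cores evenly across simultaneous jobs,
# never returning fewer than 1 thread per job.
threads_per_job() {
  cores=$1
  jobs=$2
  threads=$(( cores / jobs ))
  if [ "$threads" -lt 1 ]; then
    threads=1
  fi
  echo "$threads"
}

# Number of online cores; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
jobs=2   # assumed number of blastp runs started at the same time

echo "jobs=$jobs threads_per_job=$(threads_per_job "$cores" "$jobs")"
```

For example, on 8 cores with 2 jobs this suggests `-j 2` for parallel and `-num_threads 4` for each blastp run.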

ADD REPLYlink modified 12 days ago • written 12 days ago by lieven.sterck10k

Yes, with -outfmt 6 it works perfectly!!! My issue starts when I edit the output format.

ADD REPLYlink written 12 days ago by K.Gee40

This is almost certainly a quirk of how parallel treats ", I would think, but I can't spot the issue offhand.
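
The suspicion can be checked without blast or parallel installed. The sketch below assumes that parallel, by default, joins its command arguments into one string and has a shell parse that string a second time. `show_args` is a hypothetical stand-in for blastp that only counts its arguments, and `reparse` merely imitates that second parsing step with `eval`; neither is part of the original command.

```shell
# Hypothetical stand-in for blastp: prints how many arguments it received.
show_args() { echo "$#"; }

# Imitates (as an assumption) what parallel does by default: join the
# arguments with spaces, then let the shell parse the resulting string again.
reparse() { eval "$*"; }

# Called directly, the quoted format string stays a single argument.
direct=$(show_args -outfmt "6 sseqid pident evalue")

# Double quotes are consumed by the first shell, so the second parse
# splits the format string into separate words.
double=$(reparse show_args -outfmt "6 sseqid pident evalue")

# Single quotes are consumed by the first shell in the same way.
single=$(reparse show_args -outfmt '6 sseqid pident evalue')

# Wrapping the double quotes in single quotes lets them reach the second parse.
escaped=$(reparse show_args -outfmt '"6 sseqid pident evalue"')

echo "direct=$direct double=$double single=$single escaped=$escaped"
# prints: direct=2 double=5 single=5 escaped=2
```

If the assumption holds, this would explain why switching from " to ' changes nothing, and why protecting the inner quotes from the first shell (or single-quoting the whole command) is what matters.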

ADD REPLYlink written 12 days ago by Joe19k

I tried almost everything before posting my issue. I also realised that I cannot increase -num_threads as I do in a "normal" blast search, so I guess it is a bug in the parallel command rather than a typing mistake on my part.

ADD REPLYlink modified 12 days ago • written 12 days ago by K.Gee40

See whether this populates the correct queries, and then try removing --dry-run:

$ parallel --dry-run  'blastp -query {} -db MYDATABASE -evalue 1e-5  -outfmt "6 sseqid pident evalue bitscore staxids" -out {.}.txt' ::: *.fna

Also check your parallel version.

ADD REPLYlink modified 11 days ago • written 11 days ago by cpad011215k

Works too!!! thank you very much!

ADD REPLYlink written 11 days ago by K.Gee40
12 days ago by
ATpoint46k
ATpoint46k wrote:

I second the other suggestions that this is a conflict in how parallel and bash interpret quotes. I generally find it best to write a function for the command and then feed the function into parallel, which avoids exactly what you are experiencing. I do not see a reason to use -a, so I am leaving it out here (untested, based on your code; I do not know the blast syntax):

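# Wrap the blast call in a function so the quoted -outfmt string is parsed inside the function only.
function blasty {
  blastp \
    -query "${1}" \
    -out "${1%.fna}.txt" \
    -db MYDATABASE -evalue 1e-5 \
    -outfmt "6 sseqid pident evalue bitscore staxids sscinames sskingdoms stitle"
}
# Export the function so the shells started by parallel can see it (bash only).
export -f blasty

ls *.fna | parallel blasty {}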
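
The output name in the function comes from the `${1%.fna}` parameter expansion, which strips a trailing `.fna` from the first argument. A small sketch of that step on its own, where `out_name` is a hypothetical helper and the file names are made up for illustration:

```shell
# Hypothetical helper: derive the output file name the same way the
# function above does, by stripping a trailing ".fna" and appending ".txt".
out_name() { printf '%s\n' "${1%.fna}.txt"; }

out_name sample1.fna        # prints: sample1.txt
out_name dir/genome.v2.fna  # prints: dir/genome.v2.txt
out_name reads.faa          # prints: reads.faa.txt (suffix does not match, nothing is stripped)
```

The last case shows that input files with a different extension keep their full name, so the pattern would need adjusting for e.g. .faa files.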

MYDATABASE is worth checking as well: is this a variable, or simply a placeholder for the actual path to the file (or whatever it is; as said, I am not a blast user myself)?

ADD COMMENTlink modified 11 days ago • written 12 days ago by ATpoint46k

Thank you for the response. I have a question regarding your answer. I pasted the command into my terminal, but it looks like something is missing, as I get a blinking ">". Do I have to remove the \, or do I have to run the code in a different way? I'm sorry about that, but I am still learning bash.

ADD REPLYlink written 11 days ago by K.Gee40

Just changed it: I accidentally wrote blastp rather than blasty in the parallel call, please try again. When I paste the function it works; make sure the database variable is correct.

ADD REPLYlink modified 11 days ago • written 11 days ago by ATpoint46k

It works . Thanks a lot !!!!! :-)

ADD REPLYlink written 11 days ago by K.Gee40