BLAST error: invalid query sequence / filtering options. What is the problem?
8.4 years ago
john ▴ 130

Hello people

I'm trying to run blast (blast-2.2.26) on a fasta file. But I get this error message:

[blastall] WARNING: MaulwurfLeber_H21F7XH01DFTZ0_rank=0133853_x=1293_: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options

I call blast like this:

blastall -p blastp -i MaulwurfLeber_prots.fasta -d Pool_new_unclustered -o Contigs_prots_vs_New_unclustered.tab -a 8 -m 8 -e 0.001

The sequences that cause the error are the following:

>MaulwurfLeber_H21F7XH01DFTZ0_rank=0133853_x=1293_0_y=1550_0_length=353_-gene_1
VEIGEVVVFGEVETVVGEAVEVEAGEVVEVEVGEVEVGEVVVGEVVVV
>MaulwurfLeber_H21F7XH01DFTZ0_rank=0133853_x=1293_0_y=1550_0_length=353_-gene_2
VRWWSVRWWSFEEVKVVVGEVEVVVGEAVEVEISEVEVGEWSR

The fasta file was created with hmmsearch called like this:

hmmsearch --tblout Contigs_prots_vs_PFAMa.tab --cpu 8 -o Contigs_prots_vs_PFAMa.out --noali Pfam-A.hmm MaulwurfLeber_prots.fasta

My calls are based on the scripts from VirSorter:

https://github.com/simroux/VirSorter

I'm running the script on a cluster.

blastp blast fasta • 8.9k views

Even though this should not be happening, I wonder whether blast dislikes the = and - characters in the FASTA header. Can you try replacing them with "_" and see if that helps?

You are also not using the latest blast, so upgrade if possible.
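If you want to test the header idea quickly, something like the following should do it. This is only a minimal sketch: the function names `sanitize_header` and `sanitize_fasta` are made up for illustration, and it assumes the whole FASTA file fits in memory as one string.

```python
import re


def sanitize_header(line: str) -> str:
    """Replace '=' and '-' with '_' in a FASTA header line.

    Sequence lines (anything not starting with '>') are returned unchanged.
    """
    if line.startswith(">"):
        return ">" + re.sub(r"[=-]", "_", line[1:])
    return line


def sanitize_fasta(text: str) -> str:
    """Apply sanitize_header to every line of a FASTA file given as one string."""
    return "\n".join(sanitize_header(line) for line in text.splitlines()) + "\n"
```

Keep in mind that the renamed IDs will no longer match the IDs in any output you already produced from the original file (for example the hmmsearch table), so either rerun those steps or keep a mapping.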

8.4 years ago

This message is a warning; it shouldn't be fatal to the execution. For some combinations of blast parameters and sequences, the statistics simply can't be calculated. In your case, my guess is that it's because your peptide sequences are highly repetitive.
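A rough way to see how repetitive a peptide is would be to look at how much of it is covered by its few most common residues. The sketch below is only a crude proxy and is not the SEG filter that blastall applies to protein queries; the function name `top_fraction` and the choice of the top three residues are assumptions for illustration.

```python
from collections import Counter


def top_fraction(seq: str, k: int = 3) -> float:
    """Return the fraction of `seq` made up by its k most common residues.

    Values close to 1.0 indicate a low-complexity (highly repetitive) peptide.
    """
    seq = seq.strip().upper()
    if not seq:
        return 0.0
    counts = Counter(seq)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(seq)
```

For the two sequences in the question, which consist almost entirely of V, E and G, this value will be high, which fits the low-complexity explanation.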


Thank you for your quick answer.

If it is not a fatal error, then something else must be wrong with the call or my db, because the output file is empty.


It can also happen when there are unrecognized or unacceptable characters in the sequences. Check the database to make sure it only contains valid amino-acid characters. However, remember that an empty result file without any fatal error message could also mean that there are simply no hits to report. Given that your sequences are highly repetitive, this is likely if low-complexity filtering is turned on, which I seem to remember blastall does by default. I think the option to turn this off is -F F.
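Both checks can be sketched in a few lines. The alphabet below is an assumption (the 20 standard amino acids plus the ambiguity/special codes B, J, O, U, X, Z and `*`), so adjust it to whatever your tools accept. The helper names `invalid_residues` and `blastall_cmd` are hypothetical, and the command only reuses the flags from the question plus the `-F F` suggested above; it builds the argument list without running anything.

```python
# Assumed alphabet: 20 standard residues plus common ambiguity/special codes.
VALID = set("ACDEFGHIKLMNPQRSTVWY" "BJOUXZ" "*")


def invalid_residues(fasta_text: str) -> dict:
    """Map each FASTA header to the sorted unexpected characters in its sequence.

    Records containing only characters from VALID are left out of the result.
    """
    bad = {}
    header = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
        elif header is not None:
            odd = set(line.upper()) - VALID
            if odd:
                bad.setdefault(header, set()).update(odd)
    return {h: sorted(chars) for h, chars in bad.items()}


def blastall_cmd(query: str, db: str, out: str) -> list:
    """Build the blastall call from the question with query filtering disabled."""
    return ["blastall", "-p", "blastp", "-i", query, "-d", db, "-o", out,
            "-a", "8", "-m", "8", "-e", "0.001", "-F", "F"]
```

If `invalid_residues` returns an empty dict for both the query file and the FASTA the database was built from, odd characters are probably not the cause, and rerunning with the list from `blastall_cmd` (for example via `subprocess.run`) tests the filtering explanation.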
