blastp for short sequences not finding a sequence that I know is in the database
6 months ago
ricfoz ▴ 100

Hello all

I am trying to build a blastp command that is sensitive enough to find similarity between short query sequences (5-12 amino acids long) and the sequences in a tailored database of over 8k proteins.

I have already come up with a working command:

blastp -task blastp-short -word_size 2 -num_alignments 50 -max_hsps 1 -evalue 60 -db tailored_db -query short_seqs.fasta -out outfile.blastp

In general it works nicely, but for a couple of query sequences that I know are similar to two different proteins, the command only reports hits to one of them. I have already tried tweaking the -threshold and -window_size options, with no luck.

As a test, I removed the protein that gives hits from the database's original FASTA file, hoping that the other protein would then produce hits, but the command did not return any hit.
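
A check that does not depend on BLAST at all is to scan the FASTA file directly and list every protein that contains the peptide as an exact substring. The snippet below is only a minimal sketch of that idea in plain Python: the helper names (read_fasta, proteins_containing) and the toy sequences are made up for illustration, and in practice the handle would be the opened FASTA file used to build tailored_db. It only detects identical matches, so it says nothing about weaker similarity.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def proteins_containing(peptide, handle):
    """Return the headers of all records that contain the peptide exactly."""
    peptide = peptide.upper()
    return [h for h, seq in read_fasta(handle) if peptide in seq.upper()]


# Toy records standing in for the real database FASTA (hypothetical data).
demo = StringIO(
    ">protA\nMKTAYIAKQR\nQISFVKSHFS\n"
    ">protB\nMSHFSRQLEE\n"
    ">protC\nMAAAAAAA\n"
)
print(proteins_containing("SHFS", demo))
```

If a peptide shows up in both proteins here but blastp reports only one of them, the missing hit is being lost in the search or in the reporting rather than being absent from the FASTA file.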

I want the search to report every protein with similarity, not just one, and the resources I have found so far have not helped.

Does anyone have any idea which other flags I can tweak to make the algorithm more sensitive to other sequences?

Any help would be much appreciated.

blastp short • 434 views

What is the size of proteins in the database that you are searching against?

5-12 AA seem like extremely short queries. Perhaps you should look at pattern matching algorithms like fuzzpro: https://embossgui.sourceforge.net/demo/manual/fuzzpro.html
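
To give a rough idea of what this kind of pattern matching does, here is a minimal Python sketch that slides the peptide along a protein and keeps every ungapped window within a chosen number of mismatches. It is not how fuzzpro is implemented, and it ignores gaps and substitution scores; the function name fuzzy_find and the toy sequences are hypothetical.

```python
def fuzzy_find(peptide, protein, max_mismatches=1):
    """Return (start, window, mismatches) for each ungapped window of
    `protein` that differs from `peptide` at no more than
    `max_mismatches` positions. `start` is 0-based."""
    peptide, protein = peptide.upper(), protein.upper()
    k = len(peptide)
    hits = []
    # The range is empty when the protein is shorter than the peptide.
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        # Hamming distance between the peptide and this window.
        mismatches = sum(a != b for a, b in zip(peptide, window))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Toy proteins standing in for database entries (hypothetical data).
proteins = {
    "protA": "MKTAYIAKQRQISFVKSHFS",
    "protB": "MSHYSRQLEE",
}
for name, seq in proteins.items():
    print(name, fuzzy_find("KSHFS", seq, max_mismatches=2))
```

With queries of 5-12 residues, even a small mismatch budget can match many unrelated windows, so it is probably worth starting at 0 or 1 and loosening it only if needed.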


The proteins range from a couple of hundred amino acids to around a thousand.

They are really short query sequences indeed, but that is the reality of the problem at hand.

Thank you for suggesting this tool. I will check it out, it looks useful. I was looking for a way of tweaking the BLASTP algorithm, but hey, if this EMBOSS tool does the trick, that would be amazing.
