Degenerate/combinatorial protein BLAST search
2.2 years ago
Albert • 0

Hi,

I was wondering whether there's a way to perform protein BLAST searches that allow particular positions to be degenerate. Nucleotide BLAST accepts the IUPAC degenerate codes (N = A/C/G/T; K = G/T; and so on). For instance, say I have the following canonical sequence:

KRSCATTMG

And I want to perform a BLAST search with the first position being either K or R, and positions 6 and 7 being either T or S.

I'm aware BLAST allows for some variability within the sequence. What I need is to bias that allowance at particular positions towards particular amino acids, while allowing no variability at the other positions (in my example, for instance, position 4 must be a C).

Maybe a faster approach would be to just download the BLAST databases and filter them conditionally with a Python script?
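
A minimal sketch of that filtering idea, assuming the database has been exported to FASTA and that exact matching of the degenerate motif is what is wanted. The regular expression encodes the example above (K or R at position 1, T or S at positions 6 and 7, everything else fixed). The function names and the demo records are hypothetical, made up for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Example motif: [KR] at position 1, [ST] at positions 6 and 7, all else fixed.
MOTIF = re.compile(r"[KR]RSCA[ST]{2}MG")

def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def find_motif(lines: Iterable[str]) -> Iterator[Tuple[str, int, str]]:
    """Yield (header, 1-based start, matched text) for each non-overlapping hit."""
    for header, seq in read_fasta(lines):
        for m in MOTIF.finditer(seq.upper()):
            yield header, m.start() + 1, m.group()

# Hypothetical records, for demonstration only.
demo = """>seq1 hypothetical
MAAKRSCATTMGLL
>seq2 hypothetical
MRRSCASTMGKRSCTTTMG
>seq3 hypothetical
MKRSAATTMG
""".splitlines()

hits = list(find_motif(demo))
for hit in hits:
    print(*hit, sep="\t")
```

For a real database, the same generator can be fed an open file handle instead of `demo`, since it only iterates over lines and never loads the whole file into memory.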

2.2 years ago
Mensur Dlakic ★ 27k

Even if BLAST can do what you want, it is most likely not the best choice for it, especially if you have short search strings.

You may want to check out regular expressions like the ones used in PROSITE. For what you want, the pattern would be something like [KR]-R-S-C-A-[ST](2)-M-G. There are programs that can search databases with these patterns:

https://ftp.expasy.org/databases/prosite/ps_scan/
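
If you would rather stay in Python than install a scanner, a PROSITE-style pattern maps almost directly onto an ordinary regular expression. The sketch below is only an illustration and is not how ps_scan works internally. It assumes the common pattern elements (hyphen-separated positions, `x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` as terminal anchors) and rejects anything else. `prosite_to_regex` is a hypothetical helper name. Because elements are separated by hyphens, the last two residues are written M-G.

```python
import re

# One pattern element: a residue, x, [allowed] or {excluded},
# optionally followed by a repeat count (n) or (n,m).
ELEMENT = re.compile(
    r"(?P<core>[A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?"
)

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into Python regex syntax."""
    body = pattern.strip().rstrip(".")
    at_start = body.startswith("<")   # pattern anchored at the N-terminus
    at_end = body.endswith(">")       # pattern anchored at the C-terminus
    body = body.lstrip("<").rstrip(">")
    out = []
    for element in body.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = m["core"]
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        if m["lo"] is not None:
            core += "{" + m["lo"] + ("," + m["hi"] if m["hi"] else "") + "}"
        out.append(core)
    return ("^" if at_start else "") + "".join(out) + ("$" if at_end else "")

regex = re.compile(prosite_to_regex("[KR]-R-S-C-A-[ST](2)-M-G"))
print(regex.pattern)
print(bool(regex.search("MAAKRSCATSMGLL")))   # C at motif position 4
print(bool(regex.search("MAAKRSAATSMGLL")))   # A at motif position 4
```

The compiled expression can then be run over each sequence in a FASTA file with `regex.finditer`, which reports non-overlapping matches only.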


Thanks a lot. Your answer was really helpful, and I was able to get what I needed using ScanProsite.
