Question: (Closed) What Substitution Matrices Are Available For Blast?
Asked 9.1 years ago by Ryan Thompson, TSRI, La Jolla, CA:

I'm having trouble locating documentation on what substitution matrices are supported by command-line BLAST+ tools. Do they have a specific set of matrices built in? Are they hidden in a directory somewhere? Can I add my preferred one if it is not supported by default?

In particular, I'd like to use the BLOSUM100 matrix, because I want higher mismatch penalties.


I think it's a duplicate of . Tell me if the answer for BLAST+ would be different from the one for BLAST.

modified 13 months ago by _r_am • written 9.1 years ago by Pierre Lindenbaum

It's partially a duplicate, but I'm also looking for a list of what matrices are supported by default.

written 9.1 years ago by Ryan Thompson

As noted in "Custom Matrices With Ncbi Blast", you can get NCBI BLAST to list the available scoring matrices by specifying one that does not exist. This also works with NCBI BLAST+:

$ blastp -query fastaSeq -subject fastaSeq -matrix unknown
BLAST query/options error: unknown is not a supported matrix, supported matrices are:
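If you want to capture that message rather than read it off the terminal, a small wrapper along these lines may help. This is only a sketch: the function name `show_matrix_error` is made up for illustration, the default of `blastp` is an assumption, and it relies on nothing beyond the `-query`, `-subject` and `-matrix` options used in the command above.

```shell
# Hypothetical helper: ask a BLAST+ program for a matrix name that does not
# exist and print whatever it reports, so the message can be read or saved.
#   $1: BLAST+ program to run (assumed default: blastp)
#   $2: path to a FASTA file, used as both query and subject
show_matrix_error() {
    prog="${1:-blastp}"
    seq="$2"
    # The run is expected to fail, so merge stderr into stdout and
    # ignore the non-zero exit status.
    "$prog" -query "$seq" -subject "$seq" -matrix unknown 2>&1 || true
}
```

For example, `show_matrix_error blastp fastaSeq > matrices.txt` would save the message to a file; `fastaSeq` stands for any FASTA file you have at hand.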

Note that BLAST uses pre-calculated values for some of the statistics, which constrains the available combinations of matrix and gap penalties (or of match/mismatch scores and gap penalties). I cannot find the details on the current NCBI BLAST pages; however, the Internet Archive has captured a copy of the relevant article, with tables detailing the supported value combinations:

Other versions of BLAST handle the statistics differently and may support additional matrices. For example, WU-BLAST/AB-BLAST support a wider range of scoring matrices. Other sequence similarity search tools, such as the programs in the FASTA suite, may accept arbitrary values for these parameters because they attempt to derive the statistics from scratch for each search.

modified 7.7 years ago • written 7.7 years ago by Hamish

On my Linux system, the matrices are stored in /usr/share/ncbi/data. I have BLOSUM 45, 50, 62, 80, 90 and PAM 30, 70, 250.
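To check what your own installation ships, you can list the files in that directory. A minimal sketch follows; the function name `list_matrix_files` is made up, and the default path is simply the one reported above for one Linux system, so it may well differ on yours. The directory may also hold files that are not matrices.

```shell
# Hypothetical helper: print the names of the regular files in a BLAST data
# directory, one per line.
#   $1: directory to inspect (assumed default: /usr/share/ncbi/data)
list_matrix_files() {
    dir="${1:-/usr/share/ncbi/data}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # Skip subdirectories; the glob expands in sorted order.
    for f in "$dir"/*; do
        [ -f "$f" ] && basename "$f"
    done
    return 0
}
```

Run it as `list_matrix_files` for the default location, or pass another directory as the first argument.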

written 9.1 years ago by Neilfws