Question: (Closed) What Substitution Matrices Are Available For Blast?

Ryan Thompson • **3.4k** wrote:

I'm having trouble locating documentation on what substitution matrices are supported by command-line BLAST+ tools. Do they have a specific set of matrices built in? Are they hidden in a directory somewhere? Can I add my preferred one if it is not supported by default?

In particular, I'd like to use the BLOSUM100 matrix, because I want higher mismatch penalties.

modified 8.1 years ago by Pierre Lindenbaum ♦ **125k** • written 8.1 years ago by Ryan Thompson • **3.4k**

I think it's a duplicate of http://biostar.stackexchange.com/questions/502/custom-matrices-with-ncbi-blast. Tell me if the answer for BLAST+ could be different from the one for BLAST.

written 8.1 years ago by Pierre Lindenbaum ♦ 125k

It's partially a duplicate, but I'm also looking for a list of what matrices are supported by default.

As noted in "Custom Matrices With Ncbi Blast", you can get NCBI BLAST to tell you the available scoring matrices by specifying one that does not exist. This also works with NCBI BLAST+.
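A minimal sketch of that trick in Python, assuming the BLAST+ `blastp` binary is on the PATH. The function names, the placeholder matrix name `NO_SUCH_MATRIX`, and the input file names are made up for illustration; only the `-query`, `-subject` and `-matrix` options come from BLAST+ itself. The exact wording of the error message, and whether it lists every supported matrix, may vary between BLAST+ versions.

```python
import shutil
import subprocess


def bogus_matrix_command(query, subject, program="blastp", matrix="NO_SUCH_MATRIX"):
    """Build a BLAST+ command line naming a matrix that should not exist.

    The matrix name is a deliberately invalid placeholder (an assumption).
    """
    return [program, "-query", query, "-subject", subject, "-matrix", matrix]


def show_matrix_error(query, subject):
    """Run the command and return whatever BLAST+ wrote to stderr.

    Returns None when the program is not installed, so the caller can
    distinguish "no BLAST+" from "BLAST+ complained".
    """
    cmd = bogus_matrix_command(query, subject)
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stderr
```

Calling `show_matrix_error("query.fasta", "subject.fasta")` with two small protein FASTA files (hypothetical names) should return the complaint about the unknown matrix, if your BLAST+ version behaves as described above.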

Note that BLAST uses pre-calculated values for some of the statistics, which constrains the available combinations of matrix and gap penalties, or of match/mismatch scores and gap penalties. I cannot find the details on the current NCBI BLAST pages; however, the Internet Archive has captured a copy of the relevant news article, with tables detailing the supported value combinations: http://web.archive.org/web/20070121032949/http://www.ncbi.nlm.nih.gov/blast/blast_whatsnew.shtml#20051206.2

Other versions of BLAST handle the statistics differently and may support additional matrices. For example, WU-BLAST/AB-BLAST supports a wider range of scoring matrices (see http://www.ebi.ac.uk/Tools/sss/wublast/help/index-protein.html#matrix). Other sequence similarity search tools may accept arbitrary values for these parameters, since they attempt to derive the statistics from scratch for each search (for example, the programs in the FASTA suite).

On my Linux system, the matrices are stored in /usr/share/ncbi/data. I have BLOSUM 45, 50, 62, 80, 90 and PAM 30, 70, 250.
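To check what is installed on your own machine, something like the following Python sketch could list the matrix files in that directory. It assumes the files are named like `BLOSUM62` or `PAM250` with no extension, so matrices with other naming schemes would be skipped; the function names are hypothetical, and the data directory may be somewhere else on other systems.

```python
import re
from pathlib import Path

# Assumed naming scheme: family name followed by a number, no extension
MATRIX_NAME = re.compile(r"^(BLOSUM|PAM)(\d+)$")


def filter_matrix_names(names):
    """Keep names that look like scoring matrices, sorted by family then number."""
    matched = []
    for name in names:
        m = MATRIX_NAME.match(name)
        if m:
            matched.append((m.group(1), int(m.group(2)), name))
    return [name for _, _, name in sorted(matched)]


def list_matrices(data_dir="/usr/share/ncbi/data"):
    """Return matrix-like file names found in data_dir (empty list if it is missing)."""
    directory = Path(data_dir)
    if not directory.is_dir():
        return []
    return filter_matrix_names(p.name for p in directory.iterdir() if p.is_file())
```

Printing `list_matrices()` shows what the default location holds; pass a different path if your distribution keeps the BLAST data files elsewhere.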
