Question: (Closed) What Substitution Matrices Are Available For Blast?
8.1 years ago, Ryan Thompson (3.4k; TSRI, La Jolla, CA) wrote:

I'm having trouble locating documentation on what substitution matrices are supported by command-line BLAST+ tools. Do they have a specific set of matrices built in? Are they hidden in a directory somewhere? Can I add my preferred one if it is not supported by default?

In particular, I'd like to use the BLOSUM100 matrix, because I want higher mismatch penalties.


I think it's a duplicate of http://biostar.stackexchange.com/questions/502/custom-matrices-with-ncbi-blast . Tell me if the answer for BLAST+ would be different from the one for BLAST.

modified 6 weeks ago by RamRS (25k) • written 8.1 years ago by Pierre Lindenbaum (125k)

It's partially a duplicate, but I'm also looking for a list of what matrices are supported by default.

written 8.1 years ago by Ryan Thompson (3.4k)

As noted in "Custom Matrices With Ncbi Blast", you can get NCBI BLAST to list the available scoring matrices by specifying one that does not exist. This also works with NCBI BLAST+:

$ blastp -query fastaSeq -subject fastaSeq -matrix unknown
BLAST query/options error: unknown is not a supported matrix, supported matrices are:
BLOSUM80 
BLOSUM62 
BLOSUM50 
BLOSUM45 
PAM250 
BLOSUM90 
PAM30 
PAM70
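
If you want that list in a script rather than on screen, the error text can be filtered. This is only a minimal sketch: `parse_supported_matrices` is a made-up helper name, and it assumes the message has the same layout as the output above (a header line ending in "supported matrices are:" followed by one matrix name per line). Other BLAST+ versions may word or arrange the message differently.

```shell
# Hypothetical helper (not part of BLAST+): reads BLAST+ error text on stdin
# and prints the matrix names, one per line, sorted.
# Assumes the message layout shown in the output above.
parse_supported_matrices() {
  # Keep everything from the header line to the end of the input,
  # drop the header itself, strip stray whitespace and empty lines.
  sed -n '/supported matrices are:/,$p' \
    | tail -n +2 \
    | tr -d ' \t\r' \
    | grep -v '^$' \
    | LC_ALL=C sort
}
```

Usage would look something like `blastp -query fastaSeq -subject fastaSeq -matrix unknown 2>&1 | parse_supported_matrices`. The `2>&1` is there on the assumption that the message is written to standard error; it is harmless if it is not.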

Note that BLAST uses pre-calculated values for some of the statistics, and this constrains which combinations of matrix and gap penalties, or of match/mismatch scores and gap penalties, are available. I cannot find the details of these on the current NCBI BLAST pages; however, the Internet Archive has captured a copy of the relevant news article, with tables detailing the supported value combinations: http://web.archive.org/web/20070121032949/http://www.ncbi.nlm.nih.gov/blast/blast_whatsnew.shtml#20051206.2

Other versions of BLAST handle the statistics differently and may provide support for additional matrices. For example, WU-BLAST/AB-BLAST support a wider range of scoring matrices (see http://www.ebi.ac.uk/Tools/sss/wublast/help/index-protein.html#matrix). Other sequence similarity search tools may support arbitrary values for these parameters, since they attempt to derive the statistics from scratch for each search (for example, the programs in the FASTA suite).

modified 6.7 years ago • written 6.7 years ago by Hamish (3.1k)

On my Linux system, the matrices are stored in /usr/share/ncbi/data. I have BLOSUM 45, 50, 62, 80, 90 and PAM 30, 70, 250.
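
To check what is installed on your own machine, something like the following could be used. It is a rough sketch: `list_matrix_files` is a made-up helper name, the default directory is simply the path mentioned above and may well differ on other systems, and the `BLOSUM*`/`PAM*` file-name patterns are an assumption based on the matrix names listed in this thread.

```shell
# Hypothetical helper: list matrix-like files in a data directory.
# The default path is the one reported above for one Linux system;
# pass a different directory as the first argument if yours differs.
list_matrix_files() {
  dir="${1:-/usr/share/ncbi/data}"
  for f in "$dir"/BLOSUM* "$dir"/PAM*; do
    # Skip patterns that matched nothing (the glob stays literal).
    if [ -f "$f" ]; then
      basename "$f"
    fi
  done
}
```

Calling `list_matrix_files` with no argument looks in `/usr/share/ncbi/data`; `list_matrix_files /some/other/dir` looks elsewhere.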

written 8.1 years ago by Neilfws (48k)
The thread is closed. No new answers may be added.