Question: Custom Matrices With Ncbi Blast

8

hadasa • **1.0k** wrote:

Does anyone have an idea how to make NCBI BLAST work with custom matrices, i.e. ones other than the BLOSUM series that ship by default with NCBI BLAST?

8

Jarretinha ♦ **3.3k** wrote:

There is a very dirty trick to do that. You just need to give your custom matrix the name of a currently supported matrix and put it in the matrices directory. That's the solution you can find at NCBI. It works. But beware: the matrix you overwrote no longer has its default values.

You can check additional details in:

http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/blastall.html#5

I'm not sure which BLAST version that page refers to. On my box I have ncbi-tools (Ubuntu package) installed and, for example, PAM30 is at /usr/share/ncbi/data/PAM30. Got the idea?
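A minimal sketch of that swap, assuming the Ubuntu layout mentioned above and a custom matrix already written in the same plain-text format as the shipped files. The helper names `swap_matrix` and `restore_matrix` and the file `my_matrix.txt` are made up for illustration. The original is backed up first so the default can be put back afterwards.

```shell
#!/bin/sh
# Hypothetical helpers: install a custom matrix under a supported name, and undo it.

# swap_matrix DATA_DIR NAME CUSTOM_FILE
swap_matrix() {
    data_dir=$1; name=$2; custom=$3
    [ -f "$data_dir/$name" ] || { echo "no such matrix: $data_dir/$name" >&2; return 1; }
    [ -f "$custom" ] || { echo "no such file: $custom" >&2; return 1; }
    # Back up only once, so repeated swaps never overwrite the real original.
    [ -e "$data_dir/$name.orig" ] || cp "$data_dir/$name" "$data_dir/$name.orig" || return 1
    cp "$custom" "$data_dir/$name"
}

# restore_matrix DATA_DIR NAME
restore_matrix() {
    data_dir=$1; name=$2
    [ -f "$data_dir/$name.orig" ] || { echo "no backup for $name" >&2; return 1; }
    mv "$data_dir/$name.orig" "$data_dir/$name"
}

# Example (a system-wide install will likely need root):
#   swap_matrix /usr/share/ncbi/data PAM30 ./my_matrix.txt
#   blastall -p blastp -d nr -i myseq.fa -M PAM30
#   restore_matrix /usr/share/ncbi/data PAM30
```

Whether the replaced file is actually read depends on the BLAST version, so it is worth comparing the output against a run with the original matrix before trusting the results.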

It should work for every ncbi app. You just need to know which matrices it's using and where they are. I've tested with nucleotides and protein. You should recheck your paths.

Sorry, but you might be wrong. The nucleotide matrix is hard-coded in the source files with no option to use or replace any file.

In later versions of BLAST this is true (2.2.20 onwards). However, we successfully applied this trick using version 2.2.13, which allowed us to replace the required matrices.

3

Neilfws ♦ **48k** wrote:

I can tell you what does not work and suggest a possible solution.

When BLAST is installed locally on a Linux system from an NCBI package, the matrices are stored in /usr/share/ncbi/data, as plain text files. So, I tried copying the BLOSUM62 matrix to a new file named "BLOSUM00", then running blastall as:

```
blastall -p blastp -d nr -i myseq.fa -M BLOSUM00
```

And I got this error message:

```
Searching[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM00 is not a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM80 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM62 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM50 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM45 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM250 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM62_20 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM90 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM30 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM70 is a supported matrix
```

This indicates to me that the BLAST matrices are hard-coded in the BLAST source code. One solution might be to download the BLAST source code, find the file related to "BlastKarlinBlkGappedCalc", edit the source and see if you can compile BLAST.
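To locate the relevant file in an unpacked source tree, a plain text search is usually enough. This is a small sketch: the helper name `find_symbol` and the directory `./ncbi` are assumptions, so point it at wherever you unpacked the source.

```shell
#!/bin/sh
# find_symbol SRC_DIR SYMBOL
# Print the C source and header files under SRC_DIR that mention SYMBOL.
find_symbol() {
    src_dir=$1; symbol=$2
    [ -d "$src_dir" ] || { echo "not a directory: $src_dir" >&2; return 1; }
    # grep -F treats SYMBOL as a fixed string; -l prints file names only.
    find "$src_dir" -type f \( -name '*.c' -o -name '*.h' \) \
        -exec grep -lF -- "$symbol" {} + | sort
}

# Example (hypothetical path to the unpacked source):
#   find_symbol ./ncbi BlastKarlinBlkGappedCalc
```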

1

Eanna O' Keefe • **10** wrote:

Hi, I am having the same problem. First I tried just renaming my matrix as a default BLAST matrix (e.g. BLOSUM45). It ran fine, but the results were identical to those produced with the actual BLOSUM45 matrix. I then tried downloading the BLAST source code, editing the BLOSUM45 matrix and recompiling, but it still produced the same output as the regular BLOSUM45 matrix. Is there anyone here who has actually got a custom matrix working in BLAST? I'd really appreciate any help you could give me.

1

colinDotAIBN • **20** wrote:

If anyone comes across this: you can add additional matrices to BLAST by adding them to blast_stat.c (src\algo\blast\core).

To add a new matrix to blast_stat.c it is necessary to complete four steps. As an example, consider adding a matrix called TESTMATRIX.

1.) add a define specifying how many different gap existence and extension penalties are allowed, so it would be necessary to add the line:

```
#define TESTMATRIX_VALUES_MAX 14
```

if 14 values were to be allowed.

2.) add a two-dimensional array to contain the statistical parameters:

```
static array_of_8 testmatrix_values[TESTMATRIX_VALUES_MAX] ={ ...
```

3.) add a "prefs" array that should hint about the "optimal" gap existence and extension penalties:

```
static Int4 testmatrix_prefs[TESTMATRIX_VALUES_MAX] = {
BLAST_MATRIX_NOMINAL, ...
};
```

4.) Go to the function BlastLoadMatrixValues (in this file) and add two lines before the return at the end of the function:

```
matrix_info = MatrixInfoNew("TESTMATRIX", testmatrix_values, testmatrix_prefs, TESTMATRIX_VALUES_MAX);
ListNodeAddPointer(&retval, 0, matrix_info);
```

written 7.3 years ago by colinDotAIBN • **20**, modified 21 months ago by RamRS ♦ **27k**

I really don't think it is wise to hard code the matrices within BLAST. Maybe there should be an option to use another matrix.

Because some of the statistics use precalculated values, NCBI BLAST (and NCBI BLAST+) supports only certain combinations of matrix and gap penalties (see http://web.archive.org/web/20070121032949/http://www.ncbi.nlm.nih.gov/blast/blast_whatsnew.shtml#20051206.2). If you want to use arbitrary values for these, look at other BLAST implementations (WU-BLAST/AB-BLAST supports more matrices) or other sequence similarity search tools. For example, the FASTA suite programs derive the statistics and so don't have these constraints (although this does mean it is possible to perform meaningless searches).
