Question: Custom Matrices With NCBI BLAST
Asked by hadasa, 9.2 years ago (8 votes):

Does anyone know how to make NCBI BLAST work with custom matrices, i.e. ones other than the BLOSUM series that comes by default with NCBI BLAST?

Tags: matrix, blast • 5.2k views
Modified 4.9 years ago by Biostar.

I really don't think it is wise to hard code the matrices within BLAST. Maybe there should be an option to use another matrix.

Reply by hadasa, 9.2 years ago.

Due to the use of precalculated values for some of the statistics, NCBI BLAST (and NCBI BLAST+) only supports some combinations of matrix and gap penalties (see http://web.archive.org/web/20070121032949/http://www.ncbi.nlm.nih.gov/blast/blast_whatsnew.shtml#20051206.2). If you want to use arbitrary values for these, look at other BLAST implementations (WU-BLAST/AB-BLAST supports more matrices) or other sequence similarity search tools. For example, the FASTA suite programs derive the statistics and so don't have these constraints (although this does mean it is possible to perform meaningless searches).

Reply by Hamish, 6.1 years ago.
Answer by Jarretinha (São Paulo, Brazil), 9.2 years ago (8 votes):

There is a very dirty trick to do that. You just need to give your custom matrix the name of a currently supported matrix and put it in the matrices directory. That's the solution you can find at NCBI. It works. But beware: defaults are now a problem, because the name you borrowed no longer refers to the original matrix.

You can check additional details in:

http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/blastall.html#5

I'm not sure which BLAST version that page refers to. On my box, I have ncbi-tools (Ubuntu package) installed and, for example, PAM30 is at /usr/share/ncbi/data/PAM30. Got the idea?
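The renaming trick described above could be sketched roughly as follows. This is only an illustration: `swap_matrix` and `restore_matrix` are hypothetical helper names, `my_matrix.txt` is a placeholder, and the data directory is the one from this Ubuntu install and may differ on your system. The helpers keep a backup of the real matrix so the defaults can be put back afterwards.

```shell
#!/bin/sh
# Sketch only: helper names and example paths are assumptions, not NCBI tooling.

# swap_matrix DATA_DIR SUPPORTED_NAME CUSTOM_FILE
# Back up DATA_DIR/SUPPORTED_NAME once (as SUPPORTED_NAME.orig),
# then copy CUSTOM_FILE over it.
swap_matrix() {
    data_dir=$1; name=$2; custom=$3
    [ -f "$custom" ] || { echo "no such file: $custom" >&2; return 1; }
    [ -f "$data_dir/$name" ] || { echo "$name not found in $data_dir" >&2; return 1; }
    # Keep only the first backup, so repeated swaps never overwrite the real matrix.
    [ -f "$data_dir/$name.orig" ] || cp "$data_dir/$name" "$data_dir/$name.orig" || return 1
    cp "$custom" "$data_dir/$name"
}

# restore_matrix DATA_DIR SUPPORTED_NAME
# Put the backed-up original matrix back in place.
restore_matrix() {
    data_dir=$1; name=$2
    [ -f "$data_dir/$name.orig" ] || { echo "no backup for $name" >&2; return 1; }
    mv "$data_dir/$name.orig" "$data_dir/$name"
}

# Example (writing to a system directory usually needs root):
#   swap_matrix /usr/share/ncbi/data PAM30 ./my_matrix.txt
#   blastall -p blastp -d nr -i myseq.fa -M PAM30
#   restore_matrix /usr/share/ncbi/data PAM30
```

Note that BLAST will still apply the statistics it has stored for the borrowed name, so the reported E-values are not calibrated for the custom matrix.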


Thanks! Someone mentioned a -V T parameter that uses an 'old BLAST engine'; I have not tried it yet.

Reply by hadasa, 9.2 years ago.

yeah that's really dirty lol!

Reply by hadasa, 9.2 years ago.

Unfortunately, it does not work when using nucleotide blast (-p blastn).

Reply by In The Hope Of A Better World, 8.4 years ago.

It should work for every NCBI application. You just need to know which matrices it is using and where they are. I've tested it with both nucleotides and proteins. You should recheck your paths.

Reply by Jarretinha, 8.4 years ago.

Sorry, but you might be wrong. The nucleotide matrix is hard-coded in the source files with no option to use or replace any file.

Reply by In The Hope Of A Better World, 8.4 years ago.

While the template matrix for nucleotide searches is hard-coded, the scaling is not, and is controlled via the match/mismatch parameters.

Reply by Hamish, 6.1 years ago.
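As a rough illustration of the match/mismatch point: in legacy blastall the nucleotide match reward is set with -r and the mismatch penalty with -q (defaults 1 and -3). The helper below, `blastn_scoring_args`, is a hypothetical name introduced here; it only builds the option string after a basic sanity check, and the database and query file in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch only: blastn_scoring_args is a made-up helper, not part of BLAST.

# blastn_scoring_args REWARD PENALTY
# Print the blastall options for a nucleotide match reward (positive integer)
# and mismatch penalty (negative integer).
blastn_scoring_args() {
    reward=$1; penalty=$2
    case $reward in
        ''|*[!0-9]*|0) echo "reward must be a positive integer" >&2; return 1 ;;
    esac
    case $penalty in
        -*) ;;
        *) echo "penalty must be a negative integer" >&2; return 1 ;;
    esac
    digits=${penalty#-}
    case $digits in
        ''|*[!0-9]*|0) echo "penalty must be a negative integer" >&2; return 1 ;;
    esac
    printf '%s\n' "-r $reward -q $penalty"
}

# Example:
#   blastall -p blastn -d nt -i myseq.fa $(blastn_scoring_args 1 -3)
```

In BLAST+ the corresponding blastn options are -reward and -penalty. As with protein matrices and gap penalties, BLAST may not accept every combination.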

In later versions of BLAST (2.2.20 onwards) this is true. However, we successfully applied this trick using version 2.2.13, which allowed us to replace the required matrices.

Reply by Niallhaslam, 8.3 years ago.

Anyway, even in BLAST+ > 2.2.13 the trick is still applicable at the source code level. You can add any matrices you like; you just need to modify some header files. I tested it here with blast+-2.2.24 and it works fine! I'll update my answer ASAP.

Reply by Jarretinha, 8.3 years ago.
Answer by Neilfws (Sydney, Australia), 9.2 years ago (3 votes):

I can tell you what does not work and suggest a possible solution.

When BLAST is installed locally on a Linux system from an NCBI package, the matrices are stored in /usr/share/ncbi/data, as plain text files. So, I tried copying the BLOSUM62 matrix to a new file named "BLOSUM00", then running blastall as:

blastall -p blastp -d nr -i myseq.fa -M BLOSUM00

And I got this error message:

Searching[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM00 is not a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM80 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM62 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM50 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM45 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM250 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM62_20 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: BLOSUM90 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM30 is a supported matrix
[blastall] ERROR: Q02066.1: BlastKarlinBlkGappedCalc: PAM70 is a supported matrix

This indicates to me that the BLAST matrices are hard-coded in the BLAST source code. One solution might be to download the BLAST source code, find the file related to "BlastKarlinBlkGappedCalc", edit the source and see if you can compile BLAST.

Modified 9 months ago by RamRS.

That's the point. Looking for simple ways to do it. Have not come across an easy way of recompiling BLAST.

Reply by hadasa, 9.2 years ago.
Answer by Eanna O' Keefe, 8.4 years ago (1 vote):

Hi, I am having the same problem. First I tried just renaming my matrix as a default BLAST matrix (e.g. BLOSUM45). It ran fine, but the results were identical to those produced with the actual BLOSUM45 matrix. I then tried downloading the BLAST source code, editing the BLOSUM45 matrix and recompiling, but it still produced the same output as the regular BLOSUM45 matrix. Is there anyone here who has actually got a custom matrix working for BLAST? I'd really appreciate any help you could give me.

Answer by colinDotAIBN (Brisbane), 6.4 years ago (1 vote):

If anyone comes across this: you can add additional matrices to BLAST by adding them to blast_stat.c (src\algo\blast\core).

How to add a new matrix to blast_stat.c:

To add a new matrix to blast_stat.c it is necessary to complete four steps. As an example, consider adding a matrix called TESTMATRIX.

1.) add a define specifying how many different gap existence and extension penalties are allowed, so it would be necessary to add the line:

#define TESTMATRIX_VALUES_MAX 14

if 14 values were to be allowed.

2.) add a two-dimensional array to contain the statistical parameters:

static array_of_8 testmatrix_values[TESTMATRIX_VALUES_MAX] ={ ...

3.) add a "prefs" array that should hint about the "optimal" gap existence and extension penalties:

static Int4 testmatrix_prefs[TESTMATRIX_VALUES_MAX] = {
  BLAST_MATRIX_NOMINAL, ...
};

4.) Go to the function BlastLoadMatrixValues (in this file) and add two lines before the return at the end of the function:

matrix_info = MatrixInfoNew("TESTMATRIX", testmatrix_values, testmatrix_prefs, TESTMATRIX_VALUES_MAX);
ListNodeAddPointer(&retval, 0, matrix_info);

Modified 9 months ago by RamRS.
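
Before recompiling, it may help to confirm that all four pieces from the steps above are actually present in the source file. The sketch below greps for them; `check_matrix_steps` is a hypothetical helper, and it assumes the naming pattern from the TESTMATRIX example (upper-case name for the define, lower-case name for the two arrays). The source path in the usage comment is only an example and depends on how your BLAST source tree is laid out.

```shell
#!/bin/sh
# Sketch only: check_matrix_steps is a made-up helper, not part of BLAST.

# check_matrix_steps SOURCE_FILE MATRIX_NAME
# Report which of the four additions for MATRIX_NAME are found in SOURCE_FILE.
# Returns 0 only if all four are present.
check_matrix_steps() {
    src=$1; name=$2
    lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
    status=0
    for needle in \
        "#define ${name}_VALUES_MAX" \
        "${lower}_values[${name}_VALUES_MAX]" \
        "${lower}_prefs[${name}_VALUES_MAX]" \
        "MatrixInfoNew(\"${name}\""
    do
        # -F treats the needle as a fixed string, so brackets and parentheses are literal.
        if grep -qF -- "$needle" "$src"; then
            echo "found:   $needle"
        else
            echo "missing: $needle"
            status=1
        fi
    done
    return $status
}

# Example:
#   check_matrix_steps src/algo/blast/core/blast_stat.c TESTMATRIX
```

A "missing" line for the MatrixInfoNew call would mean the matrix is defined but never registered, in which case BLAST would still report it as unsupported.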