PSSM files - relationship between HHBlits .MTX file and Blast .PSSM file?
14 months ago

I'm looking at the outputs of psi-Blast and HHBlits. Each produces a PSSM file, but in a different format and with different numbers. I was wondering whether there's a simple one-to-one correspondence between the two (i.e. is it possible to interconvert), or whether one contains information that the other does not. From David Jones' 1999 JMB paper (Protein Secondary Structure Prediction Based on Position-specific Scoring Matrices) I can get a good idea of the contents of the psi-Blast file, but I haven't managed to find a proper description of the MTX file from HHBlits.

From looking at files produced from the same (or what I hope is the same) comparison, the order of the columns is different (alphabetic according to 3-letter code in psi-Blast, alphabetic according to 1-letter code in HHBlits), and the numbers are quite different too. The first group of values in the psi-Blast file is related to BLOSUM matrix scores (e.g. -7 -7 -8 -9 -7 -6...). In the HHBlits file there are three columns that don't seem to be directly related to the comparison (I must be missing something here), then 20 columns that look like scores (e.g. [re-ordered to three-letter-alphabetic] -300 0 -100 200 0 100...), then 3 more columns at the end that always seem to be "-32768 -32768 -32768".
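To make the column-order difference concrete, here is a minimal Python sketch of the re-ordering step only. It assumes the 20 score values for one residue have already been pulled out of the row, and that they really are in plain one-letter alphabetical order; a real file may interleave other symbols, so check that first. The function name `reorder_scores` and the sample row are made up for illustration, and the sketch doesn't attempt any rescaling of the values.

```python
# One-letter alphabetical order of the 20 standard amino acids
# (assumed here to be the order of the 20 score columns in the MTX row).
ONE_LETTER_ORDER = "ACDEFGHIKLMNPQRSTVWY"

# Alphabetical by three-letter code (Ala, Arg, Asn, Asp, Cys, Gln, ...),
# i.e. the psi-Blast-style column order described above.
THREE_LETTER_ORDER = "ARNDCQEGHILKMFPSTWYV"


def reorder_scores(scores, src=ONE_LETTER_ORDER, dst=THREE_LETTER_ORDER):
    """Return the 20 scores rearranged from src column order to dst order."""
    if len(scores) != len(src):
        raise ValueError(f"expected {len(src)} scores, got {len(scores)}")
    by_residue = dict(zip(src, scores))  # residue letter -> score
    return [by_residue[aa] for aa in dst]


# Hypothetical row: each value is simply its own position in the
# one-letter order, so the output shows where each column ends up.
row = list(range(20))
print(reorder_scores(row))
```

Swapping the `src` and `dst` arguments goes the other way, which is a quick check that the mapping is a pure permutation and no values are lost.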

I suppose my real question is: can I produce a pseudo-psi-Blast-style PSSM file from an HHBlits-style MTX file? I'm also interested in a fuller description of what's actually in the files. There must be resources somewhere that describe them, but I haven't managed to find them!

alignment HHBlits psi-Blast • 466 views
14 months ago

Okay, it looks like I was looking at the wrong program (I'm working on a script that replaces psi-Blast with HHBlits). My confusion arose because the "addss.pl" script (which Soedinglab supply but recommend you don't use :-)) produces an MTX file if the input is in HMMER format.

It turns out that the program I was using to produce the .MTX file is actually "seq2mtx", which is part of the PSIPRED suite (see David Jones in my original post...). Since I have the C source code of seq2mtx (which contains an array with BLOSUM62), it's possible to work out exactly what the output MTX file contains, and what the numbers mean.
