I am using the HH-suite software from Dr. Soeding (http://toolkit.tuebingen.mpg.de/hhblits) for my research projects. I have a very specific question about the output it generates; input from anybody who has used the software would help greatly.
When a query is searched against the nr20 database with hhblits, the output .hhr file reports hits against clusters of proteins in the database rather than against individual proteins. How can I get the profile-profile similarity score for each protein within each cluster, so that I can order the proteins by how closely homologous they are to the query?
For example (please refer to the hhblits output below): for cluster NR20|XULHUQABA, the database a3m file (nr20_xxx_a3m_db) has 17 proteins. The output .hhr file contains only a single profile alignment, against the consensus sequence of NR20|XULHUQABA (profile-profile similarity score 1233.89). The other hits are for other clusters.
How can we get all the pairwise profile-profile scores of our query sequence against all 17 proteins in cluster NR20|XULHUQABA?
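To make the goal concrete, here is a minimal sketch of the first step I have in mind: listing the individual members of one cluster. It assumes the cluster's alignment has already been pulled out of the database as plain a3m text; the function name `read_a3m_members` and the toy alignment are made up for illustration. It only recovers the member sequences and does not compute any profile-profile score.

```python
def read_a3m_members(a3m_text):
    """Return (header, ungapped_sequence) pairs from a3m-formatted text."""
    members = []
    header, chunks = None, []
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if header is not None:
                members.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            # Remove gap characters; lowercase letters in a3m are inserted
            # residues, so they are kept and upper-cased.
            chunks.append(line.replace("-", "").replace(".", "").upper())
    if header is not None:
        members.append((header, "".join(chunks)))
    return members


# Toy alignment (hypothetical data, not taken from nr20).
example = """>seq1
ACDEF
>seq2
A-DeeF
"""
members = read_a3m_members(example)
for name, seq in members:
    print(name, seq)
```

Each recovered sequence could then, in principle, be scored against the query one at a time. Any non-sequence records in the a3m file (for example secondary-structure lines, if present) would need to be filtered out first.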
Thanks
Example output from running the hhblits server (http://toolkit.lmb.uni-muenchen.de/hhblits/results/5323588):
Query Acetylcholine_receptor_protein_delta_chain_Torpedo_californica (seq=MGNIHFVYLL...DYSSDHPRCA Len=522 Neff=1.0 Nseqs=1)
Parameters search:local realign with MAC:yes MAC threshold=0.35
No  Hit                             Prob   E-value  P-value  Score   SS   Cols  Query HMM  Template HMM
1   NR20|XULHUQABA|16|534 Neuronal  100.0  3E-166   4E-171   1233.9  0.0  496   16-520     35-532 (534)
2   NR20|GODXONABE|128|541 Acetylc  100.0  8E-164   1E-168   1219.9  0.0  493   10-521     30-524 (541)
3   NR20|XOZLIBEBA|11|444 Chain C,  100.0  8E-138   8E-143   1014.5  0.0  388   18-521     54-443 (444)
4   NR20|SANVERABE|40|532 Acetylch  100.0  4E-116   5E-121   876.3   0.0  463   18-516     46-529 (532)
5   NR20|GUTVEBEBA|2|417 PREDICTED  100.0  5E-111   4E-116   812.4   0.0  401   110-521    1-416 (417)
6   NR20|YIXVUYEBA|4|248 Chain B,   100.0  2E-106   2E-111   750.6   0.0  247   246-504    1-247 (248)
..
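For reference, a minimal sketch of how one row of this hit list could be read programmatically. The regular expression and the name `parse_hit_line` are my own assumptions based on the columns shown here, not an official parser. Because the hit description can itself contain spaces, the numeric columns are matched from the right-hand side.

```python
import re

# Columns after the free-text hit field: Prob, E-value, P-value, Score, SS,
# Cols, query range, template range and template length in parentheses.
HIT_RE = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<hit>.+?)\s+"
    r"(?P<prob>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)\s+(?P<pvalue>\S+)\s+"
    r"(?P<score>-?\d+(?:\.\d+)?)\s+(?P<ss>-?\d+(?:\.\d+)?)\s+(?P<cols>\d+)\s+"
    r"(?P<query>\d+-\d+)\s+(?P<template>\d+-\d+)\s*\((?P<tlen>\d+)\)\s*$"
)


def parse_hit_line(line):
    """Parse one summary row; return a dict, or None if the row does not match."""
    m = HIT_RE.match(line)
    if m is None:
        return None
    return {
        "rank": int(m.group("rank")),
        "cluster": m.group("hit").split()[0],  # first token of the hit field
        "prob": float(m.group("prob")),
        "evalue": float(m.group("evalue")),
        "score": float(m.group("score")),
        "cols": int(m.group("cols")),
        "template_len": int(m.group("tlen")),
    }


rows = [
    "1 NR20|XULHUQABA|16|534 Neuronal 100.0 3E-166 4E-171 1233.9 0.0 496 16-520 35-532 (534)",
    "3 NR20|XOZLIBEBA|11|444 Chain C, 100.0 8E-138 8E-143 1014.5 0.0 388 18-521 54-443 (444)",
]
hits = [parse_hit_line(r) for r in rows]
for h in hits:
    print(h["cluster"], h["score"])
```

This still yields only one score per cluster, which is exactly the limitation I am asking about.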
The hit list is followed by profile alignments, one per cluster:
>NR20|XULHUQABA|16|534 Neuronal acetylcholine receptor subunit alpha-3; Acetylcholine receptor subunit
epsilon; Neuronal acetylcholine receptor subunit eat-2; Neuronal acetylcholine receptor subunit alpha-2;
Flags: Precursor; Neuronal acetylcholine receptor subunit beta-2. [Xenopus laevis]|148227704 [Equus caballus]
|194217616 [Ciona intestinalis]|198429844|320089451|198429842|320089453 [Meleagris gallopavo]|326926016
[Taeniopygia guttata]|224060672 [Torpedinoidei]|113101|64394|39653647|39653655|39653653 [Canis lupus
familiaris]|73994148 [Danio rerio]|122890898 [Takifugu rubripes]|31559081.
Probab=100.00 E-value=3.4e-166 Score=1233.89 Aligned_cols=496 Identities=54% Similarity=0.949 Sum_probs=465.1
Q Acetylcholine_ 16 YSGCSGVNEEERLINDLLIVNKYNKHVRPVKHNNEVVNIALSLTLSNLISLKETDETLTSNVWMDHAWYDHRLTWNASEY 95 (522)
Q Consensus 16 ysgcsgvneeerlindllivnkynkhvrpvkhnnevvnialsltlsnlislketdetltsnvwmdhawydhrltwnasey 95 (522)
-+|-.+.|||+|||++|+ +.|||.+||++|-++.|..-+.|||+|||||+|.+||+|+||||+++|-|.||.||.|+|
T Consensus 35 l~~~~~~n~E~rLi~~Lf--~~Yn~~vRP~~~~d~kv~V~v~lTLtnLISLnEkeE~ltTnVwie~~W~DyRl~Wn~s~y 112 (534)
T NR20|XULHUQABA 35 LDLSVRSNEEGRLISYLF--EGYNKRVRPARKKDDKVDVSVKLTLTNLISLNEKEETLTTNVWIEIQWTDYRLSWNPSEY 112 (534)
Confidence 344444699999999987 569999999999999999999999999999999999999999999999999999999999
..
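A minimal sketch of how the per-alignment statistics could be collected from the alignment section, assuming each block starts with a `>` header line and contains one `Probab=... Score=...` line as in the excerpt above. The function names `parse_alignment_stats` and `scores_per_hit` are hypothetical. This gives one score per cluster, sorted from best to worst; it does not produce scores for the individual member proteins.

```python
def parse_alignment_stats(line):
    """Turn 'Probab=100.00 E-value=3.4e-166 ...' into a dict of floats."""
    stats = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            # 'Identities=54%' carries a percent sign; strip it before converting.
            stats[key] = float(value.rstrip("%"))
    return stats


def scores_per_hit(hhr_text):
    """Return (hit_id, score) pairs, highest score first."""
    results = []
    current = None
    for line in hhr_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
        elif line.startswith("Probab=") and current is not None:
            results.append((current, parse_alignment_stats(line)["Score"]))
            current = None  # wait for the next '>' header
    return sorted(results, key=lambda pair: pair[1], reverse=True)


example = """>NR20|XULHUQABA|16|534 Neuronal acetylcholine receptor subunit alpha-3
Probab=100.00 E-value=3.4e-166 Score=1233.89 Aligned_cols=496 Identities=54% Similarity=0.949 Sum_probs=465.1
"""
ranking = scores_per_hit(example)
print(ranking)
```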