Hi everyone,
I just ran my first hhblits search (hhblits -cpu 4 -M first -i MSA/g_1.fa.out -d my_databases/my_db) and noticed multiple hits to the same cluster in my results file (e.g. see column 2 below). I'm guessing these represent different domains with homology to my query MSA that are all significant, but I wanted to double-check that this makes sense. Has anyone run this before and seen similar output?
 No   Hit          Prob   E-value P-value  Score  SS  Cols  Query HMM  Template HMM
  1 cluster_id_124 100.0   1E-42 6.7E-46  242.0   0.0  201   13-221   101-350 (396)
  2 cluster_id_124 100.0 1.6E-42   1E-45  241.0   0.0  202    7-219    48-261 (396)
  6 cluster_id_124 100.0 9.2E-37 6.1E-40  211.5   0.0  198   11-218   142-391 (396)
Also, my database is made up of ~2k HMMs. Why, then, does the output results file report only 136 searched HMMs?
Query         g_1
Match_columns 229
No_of_seqs    1529 out of 22987
Neff          11.9485
Searched_HMMs 136
Thank you for any input.
Is this from a custom database?
The output looks reasonable at a glance, but I’ve not seen cluster_id_xxx before. I typically use hhsearch too, so there could be some difference in the program that I’m not accounting for. I usually run my searches against the PDB, so I get PDB hits back.
Yes, this is from a custom database. Each HMM in my database is produced from a multiple sequence alignment of an ortholog group.
Do you also see duplicate hits when you use the PDB?
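One way to check whether repeated hits to the same cluster cover different regions is to compare their Query HMM and Template HMM ranges. Below is a minimal Python sketch, not part of HH-suite: the function names (`parse_hits`, `overlap`, `repeated_hit_overlaps`) are made up for illustration, and the regular expression assumes hit names contain no spaces, as in the rows pasted above. Real hit names can contain spaces, in which case the parsing would need adjusting.

```python
import re
from collections import defaultdict
from itertools import combinations

# Matches a hit row: hit number, hit name, then the query and template ranges
# and the template length in parentheses at the end of the line.
# Assumption: the hit name has no spaces (true for the rows in this thread).
HIT_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s.*?\s(\d+)-(\d+)\s+(\d+)-(\d+)\s*\((\d+)\)\s*$"
)


def parse_hits(text):
    """Return one dict per hit row; lines that do not look like hit rows are skipped."""
    hits = []
    for line in text.splitlines():
        m = HIT_LINE.match(line)
        if m:
            no, name, qs, qe, ts, te, _template_len = m.groups()
            hits.append({
                "no": int(no),
                "hit": name,
                "query": (int(qs), int(qe)),
                "template": (int(ts), int(te)),
            })
    return hits


def overlap(a, b):
    """Number of positions shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def repeated_hit_overlaps(hits):
    """For every pair of hits with the same name, report query and template overlap."""
    by_name = defaultdict(list)
    for h in hits:
        by_name[h["hit"]].append(h)
    rows = []
    for name, group in by_name.items():
        for a, b in combinations(group, 2):
            rows.append((
                name, a["no"], b["no"],
                overlap(a["query"], b["query"]),
                overlap(a["template"], b["template"]),
            ))
    return rows


# The three rows from the first post, copied as-is.
SAMPLE = """\
  1 cluster_id_124 100.0   1E-42 6.7E-46  242.0   0.0  201   13-221   101-350 (396)
  2 cluster_id_124 100.0 1.6E-42   1E-45  241.0   0.0  202    7-219    48-261 (396)
  6 cluster_id_124 100.0 9.2E-37 6.1E-40  211.5   0.0  198   11-218   142-391 (396)
"""

if __name__ == "__main__":
    for name, i, j, q_ov, t_ov in repeated_hit_overlaps(parse_hits(SAMPLE)):
        print(f"{name}: hits {i} and {j} share {q_ov} query and {t_ov} template columns")
```

On the three pasted rows, every pair shares more than 200 query columns, which would suggest the same query region aligning to shifted parts of the template rather than separate query domains. Treat that as one reading of these numbers, not a confirmed explanation.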