Hello everyone,
I would like to ask how the conditional and independent E-values reported by hmmscan (domain search) are meant to be used.
The HMMER manual defines the two as follows:
"(12) c-Evalue: The “conditional E-value”, a permissive measure of how reliable this particular domain may be. The conditional Evalue is calculated on a smaller search space than the independent E-value. The conditional E-value uses the number of targets that pass the reporting thresholds. The null hypothesis test posed by the conditional E-value is as follows. Suppose that we believe that there is already sufficient evidence (from other domains) to identify the set of reported sequences as homologs of our query; now, how many additional domains would we expect to find with at least this particular domain’s bit score, if the rest of those reported sequences were random nonhomologous sequence (i.e. outside the other domain(s) that were sufficient to identified them as homologs in the first place)?"
"(13) i-Evalue: The “independent E-value”, the E-value that the sequence/profile comparison would have received if this were the only domain envelope found in it, excluding any others. This is a stringent measure of how reliable this particular domain may be. The independent E-value uses the total number of targets in the target database."
As I'm not too knowledgeable about how HMMER works, I'm unsure how these two metrics differ in practice. What I do understand (please correct me if I am wrong) is that both measure how reliable a domain hit is, similar to how the BLAST E-value works.
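To make my current reading of the manual concrete: as far as I can tell, both values start from the same per-domain P-value and differ only in the search-space size it is multiplied by. Here is a minimal Python sketch of that interpretation. The function names and all the numbers are made up for illustration; this is not HMMER's code or API, just my assumption about what the two definitions amount to.

```python
# Sketch of my understanding: E-value = P-value * search-space size.
# All names and numbers here are hypothetical, chosen only for illustration.

def independent_evalue(p_value: float, n_targets_in_db: int) -> float:
    """i-Evalue: scale by the total number of targets in the target database."""
    return p_value * n_targets_in_db


def conditional_evalue(p_value: float, n_targets_reported: int) -> float:
    """c-Evalue: scale by only the targets that passed the reporting thresholds."""
    return p_value * n_targets_reported


p_value = 1e-6             # hypothetical P-value of one domain's bit score
n_targets_in_db = 20_000   # hypothetical size of the target database
n_targets_reported = 5     # hypothetical number of targets that were reported

i_evalue = independent_evalue(p_value, n_targets_in_db)
c_evalue = conditional_evalue(p_value, n_targets_reported)

print(f"i-Evalue: {i_evalue:.3g}")
print(f"c-Evalue: {c_evalue:.3g}")
```

If this is right, the c-Evalue can never be larger than the i-Evalue for the same domain, because the reported targets are a subset of the database. That would explain why the manual calls one "permissive" and the other "stringent".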
QUESTION: In what case would one choose to use the conditional E-value over the independent E-value? Why are there two E-values in the first place?
Thank you.
P.S. If anyone would be so kind as to recommend reading material on the topic, I would be very grateful!