Question: Hmmscan Bias Values Greater than One
pld (United States) wrote, 5.9 years ago:

My understanding is that the bias value in the hmmscan output is supposed to range between 0 and 1:

  • Bias - The bias composition correction (ranging between 0 and 1), is the bit score difference contributed by the null2 model. High bias scores may be a red flag for a false positive. It is difficult to correct for all possible ways in which nonrandom but nonhomologous biological sequences can appear to be similar, such as short-period tandem repeats, so there are cases where the bias correction is not strong enough (creating false positives).

 http://hmmer.janelia.org/help/result

However, for many hits against the PfamA database, I see bias values above 1 in both the "full sequence" and "this domain" categories. Am I missing something?

Here's the command I used:

hmmscan --domtblout <out fi> --cpu 22 PfamA.hmm <input data> > hmmscan.log

The HMMER version is 3.1b1.

 

Tags: hmmer, hmmscan, pfam

From the user guide:

The next number, the bias, is a correction term for biased sequence composition that has been applied to the sequence bit score. For instance, for the top hit MYG_PHYCA that scored 222.7 bits, the bias of 3.2 bits means that this sequence originally scored 225.9 bits, which was adjusted by the slight 3.2 bit biased-composition correction. The only time you really need to pay attention to the bias value is when it’s large, on the same order of magnitude as the sequence bit score.

After reading this part, it makes more sense why the bias value can be above one, but now I'm not sure why the documentation on the webpage says it ranges between 0 and 1.
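To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch. The function name is my own and not part of HMMER; it only assumes, as the user guide states, that the reported bit score already has the bias subtracted.

```python
def uncorrected_score(reported_score: float, bias: float) -> float:
    """Bit score before the biased-composition correction was subtracted.

    Assumption (from the user guide passage): reported = original - bias.
    """
    return reported_score + bias


# Values from the user guide example for MYG_PHYCA.
original = uncorrected_score(222.7, 3.2)
print(round(original, 1))  # 225.9
```

Nothing in this relation limits the bias to the 0 to 1 range; it is a quantity in bits, just like the score.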

(comment by pld, 5.9 years ago)
pld (United States) wrote, 5.9 years ago:

The user guide's example and definition are correct. The bias field in HMMER results is the bit score difference introduced by the biased-composition correction, so it is measured in bits and can take values greater than one.

I just heard back from them; they are fixing the website now.
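For anyone who wants to screen their own results, here is a rough Python sketch that reads a `--domtblout` file and flags hits whose bias is large relative to the score. The column positions follow the standard domtblout layout (full-sequence score and bias in columns 8 and 9, per-domain score and bias in columns 14 and 15, counting from 1). The names `DomainHit`, `parse_domtblout`, and `bias_is_large` are hypothetical, and the 0.1 ratio is only my loose reading of "same order of magnitude", not a HMMER-defined threshold.

```python
from typing import Iterable, Iterator, NamedTuple


class DomainHit(NamedTuple):
    target: str
    query: str
    seq_score: float  # full-sequence bit score
    seq_bias: float   # full-sequence bias, in bits
    dom_score: float  # per-domain bit score
    dom_bias: float   # per-domain bias, in bits


def parse_domtblout(lines: Iterable[str]) -> Iterator[DomainHit]:
    """Yield one DomainHit per data row of a --domtblout table."""
    for line in lines:
        # Skip blank lines and the '#' header/footer comments.
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split()
        yield DomainHit(f[0], f[3], float(f[7]), float(f[8]),
                        float(f[13]), float(f[14]))


def bias_is_large(score: float, bias: float, ratio: float = 0.1) -> bool:
    """Heuristic (an assumption, not a HMMER rule): bias within roughly
    one order of magnitude of the bit score."""
    return bias > 0 and bias >= ratio * abs(score)
```

A typical use would be to open the table, iterate over `parse_domtblout(handle)`, and print the hits for which `bias_is_large(hit.seq_score, hit.seq_bias)` is true. With the user guide example (score 222.7, bias 3.2) the check returns False, which matches the guide's description of that correction as slight.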

Siva (United States) wrote, 5.9 years ago:

I checked the HMMER3 User Guide and found the following footnote on page 18:

The method that HMMER3 uses to compensate for biased composition is unpublished, and different from HMMER2. We will write it up when there’s a chance.

The example hmmsearch result on the same page has bias values greater than 1.

        --- full sequence ---   --- best 1 domain ---    -#dom-
           E-value  score  bias    E-value  score  bias    exp  N  Sequence              Description
           ------- ------ -----    ------- ------ -----   ---- --  --------              -----------
             6e-65  222.7   3.2    6.7e-65  222.6   2.2    1.0  1  sp|P02185|MYG_PHYCA   Myoglobin OS=Physeter catodon GN=MB PE
           3.1e-63  217.2   0.1    3.4e-63  217.0   0.0    1.0  1  sp|P02024|HBB_GORGO   Hemoglobin subunit beta OS=Gorilla gor

I have not used HMMER3 yet and it seems there are major changes compared to HMMER2.


That's not very comforting. I guess I'll email them.

(comment by pld, 5.9 years ago)