Question: Is the Bioconductor RRHO R package p-value computation for two.sided completely wrong?

Anthony wrote:

Hi everybody,

I am trying to run a rank-rank hypergeometric overlap analysis with the RRHO R package. I am using `alternative="two.sided"`.

The log p-values look very strange for several reasons (see the function `numericListOverlap` in `R/ExpressionAnalysis.R`):

- A p-value of 0 is not handled when the log is computed. Would `-log(pval + eps)`, with `eps` a small number, not be a better choice?
- Some p-values are above one. The expressions `2*the.mean - count` (see EDIT below) and `log.pval<- -log( phyper(q=lower+tol, m=a, n=n-a+1, k=b, lower.tail=TRUE) + phyper(q= upper-tol, m=a, n=n-a+1, k=b, lower.tail=FALSE))` are meaningless to me. I think that ~~the former should be replaced by `mean - count` and~~ (see EDIT below) the latter should be divided by two. Am I right?
- Additionally, I have no idea what the `tol` parameter means.
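
To make the first two points concrete, here is a minimal sketch that evaluates the quoted expression outside the package. The values of `a`, `b`, `n` and `tol`, the definition `the.mean <- a * b / n`, the value of `eps` and the helper name `two_tail_sum` are all my own assumptions for illustration. They are not taken from the package, so this only shows how such symptoms could arise, not what the package actually does.

```r
## Hypothetical inputs, not taken from the package or from real data.
a <- 10; b <- 10; n <- 50   # sizes of the two lists and of the universe
tol <- 0.5                  # assumed value, for illustration only
the.mean <- a * b / n       # assumed definition of `the.mean`

## Sum of the two tails as in the quoted expression, before taking -log.
## `lower` and `upper` are placed symmetrically around `the.mean`.
two_tail_sum <- function(count) {
  absval <- abs(count - the.mean)
  lower <- the.mean - absval
  upper <- the.mean + absval
  phyper(q = lower + tol, m = a, n = n - a + 1, k = b, lower.tail = TRUE) +
    phyper(q = upper - tol, m = a, n = n - a + 1, k = b, lower.tail = FALSE)
}

## Evaluate for every possible overlap count 0, 1, ..., b.
pvals <- sapply(0:b, two_tail_sum)
any(pvals > 1)        # TRUE with these inputs

## The log of an exact zero is infinite; adding a small eps keeps it finite.
eps <- 1e-300         # hypothetical choice of eps
-log(0)               # Inf
-log(0 + eps)         # finite
```

With these made-up numbers the sum exceeds one only when `count` equals `the.mean`, because `lower + tol` is then larger than `upper - tol` and the two tails overlap. I have not verified whether the package ever reaches that case.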

The package is downloaded more than 100 times per month and is the basis for the RRHO2 publication. I am therefore puzzled not to find anything about these issues on the web.

Thanks in advance,

Anthony.

**EDIT:** The if-else construct with `2*the.mean - count` is equivalent to

```
absval <- abs(count - the.mean)
upper <- the.mean + absval ## same as `2*the.mean - count`
## for `count - the.mean < 0`
```
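
A quick numerical check of this equivalence, as a sketch: the helper names `upper_ifelse` and `upper_abs` and the test values are hypothetical, and I am assuming that the other branch of the if-else simply sets `upper` to `count`.

```r
## Hypothetical helpers; only the `2*the.mean - count` branch is quoted
## from the package, the else branch is my assumption.
upper_ifelse <- function(count, the.mean) {
  if (count < the.mean) 2 * the.mean - count else count
}
upper_abs <- function(count, the.mean) {
  the.mean + abs(count - the.mean)
}

the.mean <- 2.4     # made-up value
counts <- 0:10      # made-up overlap counts

## Compare both forms up to floating-point tolerance.
same <- all.equal(sapply(counts, upper_ifelse, the.mean = the.mean),
                  sapply(counts, upper_abs, the.mean = the.mean))
same
```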

But I still do not understand the p-values above one.