Is the Bioconductor RRHO R package p-value computation for two.sided completely wrong?
3.1 years ago
Anthony • 0

Hi everybody,

I am trying to run a rank-rank hypergeometric overlap analysis with the RRHO R package, using alternative="two.sided".

The log p-values look very strange to me, for several reasons (see the function numericListOverlap in R/ExpressionAnalysis.R):

  • A p-value of 0 is not handled before the log is taken. Wouldn't -log(pval + eps), with eps a small number, be a better choice?
  • Some p-values are above one. Both 2*the.mean - count (see EDIT below) and log.pval<- -log( phyper(q=lower+tol, m=a, n=n-a+1, k=b, lower.tail=TRUE) + phyper(q= upper-tol, m=a, n=n-a+1, k=b, lower.tail=FALSE)) make no sense to me. I think the former should be replaced by mean - count (but see EDIT below) and the latter should be divided by two. Am I right?
  • Additionally, I have no idea what the tol parameter means.
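
To make the first bullet concrete, here is a minimal sketch of the two ways to avoid -log(0). The numbers are hypothetical and chosen only to give an extremely small tail probability, and eps is an arbitrary illustrative value. The log.p argument is a documented option of phyper that returns the probability directly on the log scale.

```r
# Hypothetical inputs, picked only so that the upper tail is extremely small.
p.plain <- phyper(q = 900, m = 1000, n = 10000, k = 1000, lower.tail = FALSE)
log.p   <- phyper(q = 900, m = 1000, n = 10000, k = 1000,
                  lower.tail = FALSE, log.p = TRUE)

# Option 1: add a small constant before taking the log.
# eps is an illustrative choice; the result can never exceed -log(eps).
eps <- .Machine$double.xmin
neglog.eps <- -log(p.plain + eps)

# Option 2: let phyper work on the log scale, so no constant is needed.
neglog.true <- -log.p

print(c(with.eps = neglog.eps, log.scale = neglog.true))
```

Note that log.p = TRUE only helps for a single tail. A sum of two tails, as in the two.sided formula, would still have to be combined on the log scale.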

The package is downloaded more than 100 times per month and is the basis for the RRHO2 publication, so I am puzzled that I cannot find anything about these issues on the web.

Thanks in advance,


EDIT: The if-else construct with 2*the.mean - count is equivalent to

             absval <- abs(count - the.mean)
             upper <- the.mean + absval ## same as `2*the.mean - count`
                                        ## for  `count - the.mean < 0`
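
A quick numeric check of that identity, using made-up values for the.mean and count (they are not taken from the package):

```r
# Hypothetical values, used only to check the arithmetic.
the.mean <- 25

# Case count < the.mean: the.mean + absval equals 2*the.mean - count.
count.low <- 10
upper.low <- the.mean + abs(count.low - the.mean)
stopifnot(upper.low == 2 * the.mean - count.low)

# Case count > the.mean: the.mean + absval is simply count.
count.high <- 40
upper.high <- the.mean + abs(count.high - the.mean)
stopifnot(upper.high == count.high)

print(c(upper.low = upper.low, upper.high = upper.high))
```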

But I still do not understand the p-values above one.
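
Here is a small sketch that reproduces a value above one with the sum of tails quoted above. The helper two_sided_sum is hypothetical, and it rests on assumptions I have not confirmed against the package source: the.mean = a*b/n, lower = the.mean - absval, and tol = 0.5.

```r
# Hypothetical helper mirroring the quoted sum of two phyper tails.
# Assumptions (not confirmed from the package source):
#   the.mean = a * b / n, lower = the.mean - absval, tol = 0.5
two_sided_sum <- function(count, a, b, n, tol = 0.5) {
  the.mean <- a * b / n
  absval   <- abs(count - the.mean)
  lower    <- the.mean - absval
  upper    <- the.mean + absval
  phyper(q = lower + tol, m = a, n = n - a + 1, k = b, lower.tail = TRUE) +
    phyper(q = upper - tol, m = a, n = n - a + 1, k = b, lower.tail = FALSE)
}

# count equal to the.mean: lower == upper, so both tails include X = 25
# and the sum is 1 + P(X = 25).
p.at.mean <- two_sided_sum(count = 25, a = 50, b = 50, n = 100)

# count far from the.mean: the two tails do not overlap.
p.far <- two_sided_sum(count = 10, a = 50, b = 50, n = 100)

print(c(at.mean = p.at.mean, far = p.far))
```

Under these assumptions the two tails overlap whenever count is within tol of the.mean, so the point mass at the overlap is counted twice and -log of the sum becomes negative. One possible guard would be to cap the sum with min(1, p), but that is my suggestion, not something the package does as far as I can tell.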

RRHO overlap rank-rank hypergeometric overlap • 1.4k views
