Question

edgeR: likelihood ratio test or quasi-likelihood F-test?


7.1 years ago

moxu ▴ 510

In edgeR, there are two tests to choose from: the likelihood ratio test (LRT) and the quasi-likelihood F-test (QLF). I found that the two tests gave very different results (at least when comparing an interaction term with the intercept) when an input categorical factor takes more than two values. The heatmaps are very different too: the genes from the QLF result cluster into a couple of obvious patterns, while the genes from the LRT result look more dissimilar to each other.

Which test do you prefer, and what explains the heatmap observation above?

Thanks!

R rna-seq next-gen • 13k views

7.1 years ago by moxu ▴ 510


If you ask this question on Bioconductor support, you will probably get better answers. Anyway, the edgeR User Guide states:

While the likelihood ratio test is a more obvious choice for inferences with GLMs, the QL F-test is preferred as it reflects the uncertainty in estimating the dispersion for each gene. It provides more robust and reliable error rate control when the number of replicates is small.

This seems to be the general consensus, e.g., see here.
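
For reference, here is a minimal sketch of the two pipelines side by side. It assumes simulated counts and a hypothetical three-level factor `group` with three replicates per level; the gene and sample names are made up, and only the edgeR calls themselves are real. Treat it as an illustration of where the two tests diverge in the workflow, not as a recommended analysis.

```r
library(edgeR)

# Simulated data (assumption): 200 genes, 3 groups x 3 replicates, no true differences
set.seed(1)
n_genes <- 200
group <- factor(rep(c("A", "B", "C"), each = 3))
counts <- matrix(
  rnbinom(n_genes * length(group), mu = 100, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("s", seq_along(group)))
)

# Steps shared by both tests
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~ group)
y <- estimateDisp(y, design)

# Likelihood ratio test: NB GLM fit, chi-square reference distribution
fit_lrt <- glmFit(y, design)
res_lrt <- glmLRT(fit_lrt, coef = 2:3)   # any difference among the three groups

# Quasi-likelihood F-test: adds a gene-wise QL dispersion, F reference distribution
fit_ql <- glmQLFit(y, design)
res_ql <- glmQLFTest(fit_ql, coef = 2:3)

topTags(res_lrt, n = 5)
topTags(res_ql, n = 5)
```

Both tests use the same design matrix and the same `coef` argument, so switching between them only changes the fitting and testing calls.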

7.1 years ago by h.mon 35k


Great! Thanks a lot for the reply. test="F" appears to have been phased out already, since it is no longer mentioned in the User's Guide.

The simple answer is: only use LRT when there are no replicates; otherwise use QLF.

7.1 years ago by moxu ▴ 510


I should add one thing, though I am not sure it is true: if you have done ERCC normalization, LRT might be more powerful even with replicates.

The reason I say this is that I have ERCC-normalized samples, run through RSEM and then edgeR. The p-values obtained with LRT are in general much smaller (e.g. on the order of 1e-300) than the QLF p-values (e.g. on the order of 1e-20), and the top genes found with LRT seem to make more sense (i.e. they were previously identified and validated in biological experiments). It might be that the ERCC normalization minimized between-library variation in gene expression levels.
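
One thing that may account for part of that gap, independent of ERCC normalization, is the reference distribution: the LRT p-value comes from a chi-square distribution, while the QLF p-value comes from an F distribution whose denominator degrees of freedom (residual df plus prior df) are small when there are few replicates. The sketch below uses base R only. The statistics and the denominator df are hypothetical, and the same statistic is reused for both tests purely for illustration (in edgeR the two statistics are not numerically identical).

```r
# Hypothetical test statistics, reused for both tests purely for illustration
stat <- c(10, 50, 100, 500)
df_test <- 1    # one coefficient tested
df_denom <- 6   # assumed small denominator df (residual df plus prior df)

# Chi-square tail area, as used for an LRT p-value
p_lrt <- pchisq(stat, df = df_test, lower.tail = FALSE)

# F tail area, as used for a quasi-likelihood F-test p-value
p_qlf <- pf(stat / df_test, df1 = df_test, df2 = df_denom, lower.tail = FALSE)

data.frame(stat, p_lrt, p_qlf)
```

The F tail is much heavier for small `df_denom`, so a far smaller LRT p-value does not by itself show that the LRT is more powerful.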

7.1 years ago by moxu ▴ 510