Question: Difficulty replicating likelihood ratio test from RNA-seq paper
gravatar for James Ashmore
4.6 years ago by
James Ashmore3.0k
UK/Edinburgh/MRC Centre for Regenerative Medicine
James Ashmore3.0k wrote:

I have RNA-seq data from two treatment groups (three replicates each) and I am trying to replicate the likelihood ratio test outlined on page 1516 of this paper by Marioni et al: So far I have been able to fit a Poisson regression model for each gene using the following function and command:

poissonGLM <- function(gene) {
  data <- data.frame(treatment=pdata$treatment, count=gene)
  model <- glm(count ~ treatment, data=data, family=poisson)
  mean <- predict(model.pois, type="response"),
  stdev <- predict(model.pois, type="response",$

results <- apply(edata, 1, poissonGLM)

For each gene, this computes the expected mean and expected variance for each treatment group. However, I'm unsure how to use this model to compute the maximum likelihood estimates both under the null and alternative hypothesis (as described in the paper), and generate a P-value from the Chi-squared distribution.

rna-seq R • 1.6k views
ADD COMMENTlink modified 4.6 years ago by Collin850 • written 4.6 years ago by James Ashmore3.0k
gravatar for Collin
4.6 years ago by
United States
Collin850 wrote:

anova(model, test="Chi") should give you the likelihood ratio test result comparing the effect of treatment vs intercept only model. This is the same as if you had manually fit a glm with treatment and a second glm with intercept only and computed their deviance and compared it to a chi-squared distribution (d.f. = 1 in your case).

ADD COMMENTlink written 4.6 years ago by Collin850

So the intercept only model is what Marioni et al. refer to as the null hypothesis, while the model with the treatment group specified refers to the alternative hypothesis?

glm(count ~ 1, data=data, family=poisson) # null hypothesis
glm(count ~ treatment, data=data, family=poisson) # alternative hypothesis

The P-value is therefore the following?

test <- anova(model, test="Chi")

They go on to estimate fold change by fitting the model under the alternative hypothesis, which would mean calculating the ratio of the expected means I've already computed?

treatment1_mean <- mean[1]
treatment2_mean <- mean[2]
treatment1_mean / treatment2_mean
ADD REPLYlink modified 4.6 years ago • written 4.6 years ago by James Ashmore3.0k

Yes the null hypothesis is the intercept only model. Likelihood ratio tests use the log-likelihood and not the predicted count. That means using the logLik function. This means the statistic is -2*(logLik(model.null) - logLik(model.treatment)) where the difference is subtraction because the log of ratios equal the log of the numerator minus the log of the denominator (a property of logarithms).The p-value is just the probability of observing that chi-squared statistic or greater from a chi-squared distribution with d.f. = 1 in your case. All the above work should give you the same result as the anova function I had mentioned initially. So yes the p-value can be extracted like you did above. Presumably thats what they mean by fold change but I haven't read through that paper before.

ADD REPLYlink modified 4.6 years ago • written 4.6 years ago by Collin850
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 951 users visited in the last hour