Question: (Closed) Applying linear mixed model for RNA-Seq data
1
8 months ago by
Assa Yeroslaviz1.3k
Munich
Assa Yeroslaviz1.3k wrote:

We have an RNA-Seq data set from mouse with three conditions in triplicates. For better understanding of the reaction, each of the animal was weighted and its urea levels were measured beforehand.

We would like to apply a linear mixed-model to the data-set in order to analyze the influence of weight and urea conc. on the expression intensities of genes in my RNA-Seq experiment.

The `weight.score` and `urea.score` were calculated based on multiple measurements on different days

We're trying to identify genes, for which a correlation between the urea conc., body weight and their expression behavior exits.

Our first try was to apply a Pearson correlation to the `total.score`, we calculated for each of the genes (see script below). But in the review we were asked to do to apply a mixed linear-model to test the influence of the body weight and urea concentration on those expressions values and to examine the degree of contribution of each gene to the outcome score. and here I would like to ask for your help.

We would like to include both the `urea.score` and the `weight.score` in the mixed-model (Or do I need to include the actual values in the model?). I'm not sure how to do so.

Our question is - Is there an association between the body weight and urea concentration and the intensities of the gene expression?

In the model I regards the gene expression as my dependent variable, while `urea.score` and `weight.score` are the random variables. Then fitting the mixed model with these effects would be somthing like that `lmer(Gene.X ~ (1|urea.score) + (1|weight.score), data = mergedCorr.norm)`.

But when running the linear model for urae and body-weight separately I can see a significance, while in the mixed-model this is no longer apparent. Am I setting the parameters correctly? Can I sue these two parameters as random variables, even though I have only three different levels?

I would appreciate any help or suggestions as to how I should apply these here.

thanks Assa

Script used:

``````# prepare data
library(lme4)
library(ggplot2)

Scores.RiboM <- structurelistAnimal.ID = structure(c(5L, 6L, 7L, 4L, 9L, 10L,
2L, 1L, 8L, 3L), .Label = c("C20", "C22", "C24", "CR4.1", "CR4.2",
"CR4.3", "CR4.4", "HP11", "HP12", "HP14"), class = "factor"),
group = c(3L, 3L, 3L, 3L, 2L, 2L, 1L, 1L, 2L, 1L), urea = structure(c(8L,
10L, 3L, 2L, 1L, 4L, 6L, 5L, 9L, 7L), .Label = c("105.57",
"110.71", "125.89", "195.87", "321.43", "329.92", "340.81",
"35.59", "363.37", "94.76"), class = "factor"), weight = structure(c(9L,
6L, 4L, 2L, 8L, 7L, 3L, 5L, 10L, 1L), .Label = c("20", "20.1",
"20.7", "21.9", "22.1", "22.6", "23", "23.3", "23.5", "24"
), class = "factor"), urea.score = c(1L, 1L, 1L, 1L, 1L,
2L, 3L, 4L, 4L, 4L), weight.score = c(1L, 1L, 1L, 2L, 3L,
3L, 3L, 3L, 3L, 4L), total.score = c(2L, 2L, 2L, 3L, 4L,
5L, 6L, 7L, 7L, 8L)), row.names = c(8L, 9L, 10L, 7L, 5L,
6L, 2L, 1L, 4L, 3L), class = "data.frame")

# scale() centers the data (the column mean is subtracted from the values in the column) and then scales it (the centered column values are divided by the columnâ€™s standard deviation).
gene.scaled <- structure(c(1.14326057352769, -0.331767048798829, 2.0634403441122,
0.0974979558410874, -0.477899816335822, 0.634079211640983, -0.487033114306884,
-1.06014756199103, -0.982514529237001, -0.598916014452395), .Dim = c(10L,
1L), .Dimnames = list(c("C20", "C22", "C24", "HP11", "HP12",
"HP14", "CR4.1", "CR4.2", "CR4.3", "CR4.4"), "Gene.X"))

# Correlation
mergedCorr.norm <- merge(Scores.RiboM, gene.scaled, by.x=1, by.y=0)
mergedCorr.norm <- mergedCorr.norm[order(mergedCorr.norm\$total.score, decreasing = FALSE),]
CorValue.pear <- cor.test(x=mergedCorr.norm\$total.score, y=mergedCorr.norm[,8], method = "pearson")
CorValue.spear <- cor.test(x=mergedCorr.norm\$total.score, y=mergedCorr.norm[,8], method = "spearman")

# Plotting
ggplot(mergedCorr.norm, aes(x=total.score, y=mergedCorr.norm[,8], label=mergedCorr.norm\$Animal.ID)) +
geom_point(shape=1) +
geom_smooth(method=lm) +
ggtitle(names(mergedCorr.norm)[4])+
geom_text(size=3, hjust=-0.1, vjust=-0.4,colour="darkgreen")+
annotate("text", x = 8, y = max(mergedCorr.norm[,8]),label=paste("Pear. r= ", format(CorValue.pear\$estimate, digits = 4), "\n Spear. r = ", format(CorValue.spear\$estimate, digits = 4), sep=""), colour=ifelse(CorValue.pear\$estimate>=0.9, "blue","red")) +
theme_bw(base_size = 12, base_family = "")

# linear model
lm.weight <- lm(Gene.X ~ weight.score, data = mergedCorr.norm)
lm.urea <- lm(Gene.X ~ urea.score, data = mergedCorr.norm)

# mixed-model
mixed.Exp <- lmer(Gene.X ~ (1|urea.score) + (1|weight.score), data = mergedCorr.norm)
stargazer(mixed.Exp, type = "text",
digits = 3,
star.cutoffs = c(0.05, 0.01, 0.001),
digit.separator = "")
``````
written 8 months ago by Assa Yeroslaviz1.3k

Hello Assa Yeroslaviz!

We believe that this post does not fit the main topic of this site.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!