Question: (Closed) Applying linear mixed model for RNA-Seq data
1
gravatar for Assa Yeroslaviz
3 months ago by
Assa Yeroslaviz1.2k
Munich
Assa Yeroslaviz1.2k wrote:

We have an RNA-Seq data set from mouse with three conditions in triplicates. For better understanding of the reaction, each of the animal was weighted and its urea levels were measured beforehand.

We would like to apply a linear mixed-model to the data-set in order to analyze the influence of weight and urea conc. on the expression intensities of genes in my RNA-Seq experiment.

The weight.score and urea.score were calculated based on multiple measurements on different days

We're trying to identify genes, for which a correlation between the urea conc., body weight and their expression behavior exits.

Our first try was to apply a Pearson correlation to the total.score, we calculated for each of the genes (see script below). But in the review we were asked to do to apply a mixed linear-model to test the influence of the body weight and urea concentration on those expressions values and to examine the degree of contribution of each gene to the outcome score. and here I would like to ask for your help.

We would like to include both the urea.score and the weight.score in the mixed-model (Or do I need to include the actual values in the model?). I'm not sure how to do so.

Our question is - Is there an association between the body weight and urea concentration and the intensities of the gene expression?

In the model I regards the gene expression as my dependent variable, while urea.score and weight.score are the random variables. Then fitting the mixed model with these effects would be somthing like that lmer(Gene.X ~ (1|urea.score) + (1|weight.score), data = mergedCorr.norm).

But when running the linear model for urae and body-weight separately I can see a significance, while in the mixed-model this is no longer apparent. Am I setting the parameters correctly? Can I sue these two parameters as random variables, even though I have only three different levels?

I would appreciate any help or suggestions as to how I should apply these here.

thanks Assa


Script used:

# prepare data
library(lme4)
library(ggplot2)

Scores.RiboM <- structurelistAnimal.ID = structure(c(5L, 6L, 7L, 4L, 9L, 10L, 
2L, 1L, 8L, 3L), .Label = c("C20", "C22", "C24", "CR4.1", "CR4.2", 
"CR4.3", "CR4.4", "HP11", "HP12", "HP14"), class = "factor"), 
group = c(3L, 3L, 3L, 3L, 2L, 2L, 1L, 1L, 2L, 1L), urea = structure(c(8L, 
10L, 3L, 2L, 1L, 4L, 6L, 5L, 9L, 7L), .Label = c("105.57", 
 "110.71", "125.89", "195.87", "321.43", "329.92", "340.81", 
 "35.59", "363.37", "94.76"), class = "factor"), weight = structure(c(9L, 
 6L, 4L, 2L, 8L, 7L, 3L, 5L, 10L, 1L), .Label = c("20", "20.1", 
 "20.7", "21.9", "22.1", "22.6", "23", "23.3", "23.5", "24"
 ), class = "factor"), urea.score = c(1L, 1L, 1L, 1L, 1L, 
2L, 3L, 4L, 4L, 4L), weight.score = c(1L, 1L, 1L, 2L, 3L, 
3L, 3L, 3L, 3L, 4L), total.score = c(2L, 2L, 2L, 3L, 4L, 
5L, 6L, 7L, 7L, 8L)), row.names = c(8L, 9L, 10L, 7L, 5L, 
6L, 2L, 1L, 4L, 3L), class = "data.frame")

# scale() centers the data (the column mean is subtracted from the values in the column) and then scales it (the centered column values are divided by the column’s standard deviation).
gene.scaled <- structure(c(1.14326057352769, -0.331767048798829, 2.0634403441122, 
0.0974979558410874, -0.477899816335822, 0.634079211640983, -0.487033114306884, 
-1.06014756199103, -0.982514529237001, -0.598916014452395), .Dim = c(10L, 
1L), .Dimnames = list(c("C20", "C22", "C24", "HP11", "HP12", 
"HP14", "CR4.1", "CR4.2", "CR4.3", "CR4.4"), "Gene.X"))

# Correlation
mergedCorr.norm <- merge(Scores.RiboM, gene.scaled, by.x=1, by.y=0)
mergedCorr.norm <- mergedCorr.norm[order(mergedCorr.norm$total.score, decreasing = FALSE),]
CorValue.pear <- cor.test(x=mergedCorr.norm$total.score, y=mergedCorr.norm[,8], method = "pearson")
CorValue.spear <- cor.test(x=mergedCorr.norm$total.score, y=mergedCorr.norm[,8], method = "spearman")

# Plotting
ggplot(mergedCorr.norm, aes(x=total.score, y=mergedCorr.norm[,8], label=mergedCorr.norm$Animal.ID)) +
        geom_point(shape=1) +
        geom_smooth(method=lm) +
        ggtitle(names(mergedCorr.norm)[4])+
        scale_y_continuous(name="norm. read counts")+
        geom_text(size=3, hjust=-0.1, vjust=-0.4,colour="darkgreen")+ 
        annotate("text", x = 8, y = max(mergedCorr.norm[,8]),label=paste("Pear. r= ", format(CorValue.pear$estimate, digits = 4), "\n Spear. r = ", format(CorValue.spear$estimate, digits = 4), sep=""), colour=ifelse(CorValue.pear$estimate>=0.9, "blue","red")) +
        theme_bw(base_size = 12, base_family = "")

# linear model 
lm.weight <- lm(Gene.X ~ weight.score, data = mergedCorr.norm)
lm.urea <- lm(Gene.X ~ urea.score, data = mergedCorr.norm)

# mixed-model
mixed.Exp <- lmer(Gene.X ~ (1|urea.score) + (1|weight.score), data = mergedCorr.norm)
stargazer(mixed.Exp, type = "text",
    digits = 3,
    star.cutoffs = c(0.05, 0.01, 0.001),
    digit.separator = "")
ADD COMMENTlink written 3 months ago by Assa Yeroslaviz1.2k

Hello Assa Yeroslaviz!

We believe that this post does not fit the main topic of this site.

no answers

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

ADD REPLYlink written 3 months ago by Assa Yeroslaviz1.2k
Please log in to add an answer.
The thread is closed. No new answers may be added.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 957 users visited in the last hour