Question: Differential expression - design matrix with multiple factors and paired samples
gravatar for ivanarg2
12 months ago by
ivanarg240 wrote:

I'm analyzing gene expression data from an experiment where I have (i) 6 different mice (A:D); (ii) 2 different time points ("pre" and "post" treatment); and (iii) 2 different response types ("responder" vs. "non-responder"). I'm trying to find genes with differential expression after treatment in the responders vs. non-responders. In other words: if I look at post therapy gene expression changes compared to 'pre', which genes are differentially expressed in responders compared to non-responders? Additionally, how can I include the 'paired' aspect into my analysis, where I might get more meaningful information by running the above analysis in a paired, rather than general, manner?

I've spent an entire day reading tutorials on designing design matrices for differential expression but I can't pin down the details, I'm hoping someone with experience might be able to help out.

A small reproducible example:

ms = factor(rep(c("A", "B", "C", "D", "E", "F"), each = 2)) # Mouse ID
rx = factor(rep(c("Pre", "Post"), 6)) # Pre or Post treatment
rp = factor(c(rep("NR", 6), rep("R", 6))) # Is the mouse a 'responder' or 'non-responder'?
data.frame(ms, rx, rp)
   ms   rx rp
1   A  Pre NR
2   A Post NR
3   B  Pre NR
4   B Post NR
5   C  Pre NR
6   C Post NR
7   D  Pre  R
8   D Post  R
9   E  Pre  R
10  E Post  R
11  F  Pre  R
12  F Post  R

The ideal analysis that I'd like to do is something like the following, but including the paired design, which would hopefully make the analysis more powerful:

(Post / Pre in responders) / (Post / Pre in non-responders)

The first term ("Post / Pre in responders") would find DE genes in responders, and by dividing by the second term ("Post / Pre in non-responders") I would ideally get genes which are diferentially expressed only in responders.

I tried running the "Post / Pre" analysis in responders and in non-responders separately, and manually dividing, as in the code above, but it doesn't feel right, and there must be a true way of running that with a properly designed matrix.

rna-seq limma • 468 views
ADD COMMENTlink modified 12 months ago by Devon Ryan90k • written 12 months ago by ivanarg240
gravatar for Devon Ryan
12 months ago by
Devon Ryan90k
Freiburg, Germany
Devon Ryan90k wrote:

You'll want a design of ~ms + rx*rp and the coefficient of primary interest is then the rx:rp interaction. That then accounts for the pairing and gives you the interaction you want. Note that you'll probably get an error about the matrix being of insufficient rank, in which case you should use ms = factor(c(rep(rep(c("A","B","C"), each=2), 2)). This will keep the pairing, but change things such that each of the groups contains a mouse with the same ID (otherwise correcting for the pairing will confound fitting the response group).

ADD COMMENTlink written 12 months ago by Devon Ryan90k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 864 users visited in the last hour