Question: Looking for a Viable Design Matrix for Differential Gene Expression Using Paired Samples
10 months ago by
Fullerton, CA
bdighera0 wrote:

First, let me say that I am relatively new to differential expression analysis, so please bear with me. I have read the limma user guide many times and have looked at previous posts on similar questions, but I am not convinced that my design matrix truly captures the intention of my study.

My exploratory study uses an Affymetrix U133A microarray dataset of eight subjects measured before and after sleep deprivation (16 samples in total). Two psychological evaluations (SSS and PVT) were administered at both time points, and I want to use them as continuous response variables that represent each patient's degree of sleep deprivation. My end goal is to determine which genes are differentially expressed in response to sleep deprivation in my paired samples.

Does anyone have any thoughts about whether my design matrix will yield genes found only in responders (those with higher PVT/SSS scores)? What would be my primary coefficient of interest? Additionally, is there any way that I could account for the effect of gender in my design matrix?

This is the data frame that I am using to construct the design matrix:

patient <- factor(rep(c(1,2,3,4,5,6,7,8), each=2)) #patient ID
condition <- factor(rep(c('Post', 'Pre'), 8)) #Pre or Post Treatment
gender <- factor(c(rep('F', 8), rep('M', 8))) #gender
PVT <- c(339.67,254.56,423.33,...) #Response Variable 1
SSS <- c(6,2,3,1,3,2,5,2,5,1,3,2,2,1,5,3) #Response Variable 2
data.frame(patient, condition, gender, PVT, SSS)

   patient condition gender    PVT SSS
1        1      Post     F 339.67   6
2        1      Pre      F 254.56   2
3        2      Post     F 423.33   3
4        2      Pre      F 316.09   1
5        3      Post     F 640.13   3
6        3      Pre      F 358.82   2
7        4      Post     F 321.15   5
8        4      Pre      F 491.67   2
9        5      Post     M 338.99   5
10       5      Pre      M 288.09   1
11       6      Post     M 261.96   3
12       6      Pre      M 246.69   2
13       7      Post     M 276.48   2
14       7      Pre      M 250.11   1
15       8      Post     M 267.14   5
16       8      Pre      M 249.67   3

This is my proposed design matrix:

design <- model.matrix(~patient + condition*PVT+SSS)
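
For reference, the proposed formula can be expanded to see which coefficients it produces. The sketch below uses base R only and rebuilds the variables from the table above (the PVT values are copied from the printed table). With the default `factor()` ordering the levels sort alphabetically, so `Post` becomes the reference level and the condition coefficient is named `conditionPre`. The `relevel()` step and the name `condition_ref` are illustrative additions, not part of the original post.

```r
# Sample annotation rebuilt from the table in the post
patient   <- factor(rep(1:8, each = 2))
condition <- factor(rep(c("Post", "Pre"), 8))
PVT <- c(339.67, 254.56, 423.33, 316.09, 640.13, 358.82, 321.15, 491.67,
         338.99, 288.09, 261.96, 246.69, 276.48, 250.11, 267.14, 249.67)
SSS <- c(6, 2, 3, 1, 3, 2, 5, 2, 5, 1, 3, 2, 2, 1, 5, 3)

# The proposed design, unchanged
design <- model.matrix(~ patient + condition * PVT + SSS)
print(colnames(design))   # coefficient names available for testing
print(dim(design))        # arrays x coefficients
print(qr(design)$rank)    # number of linearly independent columns

# Optional (assumption): make Pre the reference level so that the
# condition coefficient reads as Post minus Pre
condition_ref <- relevel(condition, ref = "Pre")
design_ref <- model.matrix(~ patient + condition_ref * PVT + SSS)
print(colnames(design_ref))
```

The residual degrees of freedom equal the number of arrays minus the rank of the design, so a formula with this many terms leaves very few of them with 16 arrays.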

Any input would be greatly appreciated. Thank you in advance.

10 months ago by
Kevin Blighe61k
University College London
Kevin Blighe61k wrote:

Given the low sample numbers, I would aim to keep this as simple as possible, and I think that there are multiple ways to do this. I would start with, for example:

 - ~ patient + gender + condition * PVT
 - ~ patient + gender + condition * SSS

This assumes that PVT (Psychomotor Vigilance Task) and SSS (Stanford Sleepiness Scale) are independent evaluations (response variables) and can be tested independently - correct? As such, they are not quite covariates and are actually the variables under study?

This formula (above) will also account for the patient pairing, and also gender.
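
One caveat can be checked directly in R. Each patient has a single gender, so the gender column equals the sum of the corresponding patient columns, and a model that contains both `patient` and `gender` is not of full rank. The sketch below is a minimal check under that observation, using base R and the data from the question's table. The call to `limma::nonEstimable()` is optional and runs only if limma is installed.

```r
# Sample annotation rebuilt from the table in the question
patient   <- factor(rep(1:8, each = 2))
condition <- factor(rep(c("Post", "Pre"), 8))
gender    <- factor(c(rep("F", 8), rep("M", 8)))
PVT <- c(339.67, 254.56, 423.33, 316.09, 640.13, 358.82, 321.15, 491.67,
         338.99, 288.09, 261.96, 246.69, 276.48, 250.11, 267.14, 249.67)

design <- model.matrix(~ patient + gender + condition * PVT)

# Compare the number of columns with the number of independent columns
n_coef      <- ncol(design)
design_rank <- qr(design)$rank
print(c(columns = n_coef, rank = design_rank))

# genderM is exactly the sum of the dummy columns for the male patients (5 to 8)
gender_aliased <- all(design[, "genderM"] ==
                      rowSums(design[, paste0("patient", 5:8)]))
print(gender_aliased)

# If limma is available, it can name the coefficients that cannot be estimated
if (requireNamespace("limma", quietly = TRUE)) {
  print(limma::nonEstimable(design))
}
```

Because the patient term already absorbs any constant per-patient difference, the main effect of gender is covered by the pairing itself. Testing whether the response to sleep deprivation differs by gender would instead need an interaction term such as `condition:gender`, which is a suggestion here rather than part of the answer above.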


You could also stratify by gender and conduct separate analyses for the male and female groups, using the same two formulas within each group:

 - ~ patient + condition * PVT
 - ~ patient + condition * SSS
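
A rough way to see what stratifying costs is to build the design within each gender and compare the number of arrays with the number of coefficients. The sketch below assumes the data frame from the question (values copied from its table) and uses base R only. The object name `strata` is illustrative.

```r
# Sample annotation rebuilt from the table in the question
df <- data.frame(
  patient   = factor(rep(1:8, each = 2)),
  condition = factor(rep(c("Post", "Pre"), 8)),
  gender    = factor(c(rep("F", 8), rep("M", 8))),
  PVT = c(339.67, 254.56, 423.33, 316.09, 640.13, 358.82, 321.15, 491.67,
          338.99, 288.09, 261.96, 246.69, 276.48, 250.11, 267.14, 249.67)
)

# One design matrix per gender; droplevels() removes the patients
# that do not belong to the current stratum
strata <- lapply(split(df, df$gender), function(sub) {
  sub <- droplevels(sub)
  model.matrix(~ patient + condition * PVT, data = sub)
})

# Rows are arrays, columns are coefficients
print(sapply(strata, dim))
```

Each stratum has 8 arrays and 7 coefficients, so at most one residual degree of freedom remains per gender. That is very little for estimating variances, even with the moderation limma applies.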

Also take a look through the Bioconductor support forum threads for other ideas.



Kevin, thank you so much for your response. It is exactly what I need to move forward in my analysis. You are correct: PVT and SSS are independent evaluations and can therefore be tested independently. Do you think that stratifying by gender would significantly compromise the strength of the analysis, compared with the designs that include both genders, given the reduced sample size?

Reply written 10 months ago by bdighera

I am not sure that it will be problematic to segregate based on gender. The total sample number is already not too high. Is sex / gender a known confounding factor in these types of studies?

Reply written 10 months ago by Kevin Blighe