Question: Clarity for proper design model matrix with paired sample analysis
0
gravatar for reskejak
8 months ago by
reskejak20
Michigan State University
reskejak20 wrote:

I have done a significant amount of reading over the past few days and am still uncertain about proper model matrix design for paired sample analysis. I am starting with a limma/voom object v that I would like to assess for transcriptional differences via RNA-seq between disease subtypes and their respective matched normal tissues. Thus I would like to model the effects of both Tissue and Subject.

Regarding the model matrix, I would assume to use design <- model.matrix(~0+Subject+Tissue). My understanding is that this would consider an intercept for Tissue, where Normal is always the reference, but there would be no reference Subject (due to no intercept for subject).

Note this approach elicits slightly different results than design <- model.matrix(~0+Tissue+Subject). The other alternative I have considered is design <- model.matrix(~Tissue+Subject) which considers an intercept for both Tissue and Subject. However, I don't want a reference Subject, so I think this design is not appropriate.

Here's my workflow in R:

# consider limma/voom object v

# make paired design matrix
exp.design <- data.frame(c("S1.Disease1", "S1.Normal", 
                           "S2.Disease1", "S2.Normal", 
                           "S3.Disease1", "S3.Disease2", "S3.Normal",
                           "S4.Disease1", "S4.Disease2", "S4.Normal",
                           "S5.Disease1", "S5.Disease2", "S5.Normal",
                           "S6.Disease1", "S6.Normal",
                           "S7.Disease1", "S7.Normal",
                           "S8.Disease1", "S8.Normal",
                           "S9.Disease1", "S9.Normal"))
colnames(exp.design) <- "Sample"
exp.design$subject <- c(rep("S1", 2), 
                        rep("S2", 2), 
                        rep("S3", 3), 
                        rep("S4", 3), 
                        rep("S5", 3), 
                        rep("S6", 2), 
                        rep("S7", 2), 
                        rep("S8", 2), 
                        rep("S9", 2))
exp.design$tissue <- c('Disease.subtype1', 'Normal',                     # S1
                       'Disease.subtype1', 'Normal',                     # S2
                       'Disease.subtype3', 'Disease.subtype1', 'Normal', # S3
                       'Disease.subtype3', 'Disease.subtype3', 'Normal', # S4
                       'Disease.subtype1', 'Disease.subtype1', 'Normal', # S5
                       'Disease.subtype2', 'Normal',                     # S6
                       'Disease.subtype2', 'Normal',                     # S7
                       'Disease.subtype1', 'Normal',                     # S8
                       'Disease.subtype1', 'Normal'                      # S9
                       )
Subject <- factor(exp.design$subject)
Tissue <- factor(exp.design$tissue, levels=c("Normal", "Disease.subtype1", "Disease.subtype2", "Disease.subtype3"))

# design model matrix
design <- model.matrix(~0+Subject+Tissue)
v$design <- design

# contrast matrix
contr.matrix <- makeContrasts(DiseaseSubtype1vsNormal = TissueDisease.subtype1,
                              DiseaseSubtype2vsNormal = TissueDisease.subtype2,
                              DiseaseSubtype3vsNormal = TissueDisease.subtype3,
                              levels = colnames(design))

# fit linear models etc.
vfit <- lmFit(v, v$design)
vfit <- contrasts.fit(vfit, contrasts=contr.matrix)
efit <- eBayes(vfit)

# prepare DGE lists
DiesaseSubtype1.vs.Normal <- topTable(efit, coef=1, n=Inf)
DiesaseSubtype2.vs.Normal <- topTable(efit, coef=2, n=Inf)
DiesaseSubtype3.vs.Normal <- topTable(efit, coef=3, n=Inf)

# subset by FDR < 0.05 genes
DiesaseSubtype1.vs.Normal.sig <- subset(DiesaseSubtype1.vs.Normal, adj.P.Val < 0.05)
DiesaseSubtype2.vs.Normal.sig <- subset(DiesaseSubtype2.vs.Normal, adj.P.Val < 0.05)
DiesaseSubtype3.vs.Normal.sig <- subset(DiesaseSubtype3.vs.Normal, adj.P.Val < 0.05)

I would much appreciate any insight. Thanks!

statistics rna-seq limma design R • 373 views
ADD COMMENTlink written 8 months ago by reskejak20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 702 users visited in the last hour