Dear users,
I am struggling to understand if my design is correct. I found edgeR section 3.3.1 similar to my situation but I am not that confident.
Here is my experimental design:
samples_table
sampleId cellLine treatment time IC
s1       a        vehicle   0    S
s2       a        drug      48   S
s3       a        drug      168  S
s4       b        vehicle   0    S
s5       b        drug      48   S
s6       b        drug      168  S
s7       c        vehicle   0    S
s8       c        drug      48   S
s9       c        drug      168  S
s10      d        vehicle   0    S
s11      d        drug      48   S
s12      d        drug      168  S
s13      e        vehicle   0    R
s14      e        drug      48   R
s15      e        drug      168  R
s16      f        vehicle   0    R
s17      f        drug      48   R
s18      f        drug      168  R
I have 6 cell lines treated with a drug and RNA sequenced after 48 and 168 hours of treatment. The last column (IC) indicates whether the cell line is susceptible (S) or resistant (R) to another compound.
I would like to find how resistant cell lines respond differently to the drug at 48 and/or 168 hours compared to susceptible ones.
Here is my approach:
library(edgeR)
# One group per IC status and time point, e.g. "S.0", "R.48"
group <- factor(paste(samples_table$IC, samples_table$time, sep="."))
y <- DGEList(counts.keep, group=group)
y <- calcNormFactors(y)
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)
# Quasi-likelihood test
fit <- glmQLFit(y, design)
design
   R.0 R.168 R.48 S.0 S.168 S.48
1    0     0    0   1     0    0
2    0     0    0   0     0    1
3    0     0    0   0     1    0
4    0     0    0   1     0    0
5    0     0    0   0     0    1
6    0     0    0   0     1    0
7    0     0    0   1     0    0
8    0     0    0   0     0    1
9    0     0    0   0     1    0
10   1     0    0   0     0    0
11   0     0    1   0     0    0
12   0     1    0   0     0    0
13   1     0    0   0     0    0
14   0     0    1   0     0    0
15   0     1    0   0     0    0
attr(,"assign")
[1] 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"
my.contrasts <- makeContrasts(
S48 = S.48 - S.0,
S168 = S.168 - S.0,
S168vs48 = S.168 - S.48,
R48 = R.48 - R.0,
R168 = R.168 - R.0,
R168vs48 = R.168 - R.48,
RvsS.0 = R.0 - S.0,
RvsS.48 = (R.48 - R.0) - (S.48 - S.0),
RvsS.168 = (R.168 - R.0) - (S.168 - S.0),
all = (R.48 + R.168 - R.0) - (S.48 + S.168 - S.0),
levels = design
)
# to find genes that differentially respond at 48h between resistant and susceptible cell lines
qlf <- glmQLFTest(fit,
contrast = my.contrasts[ , "RvsS.48"])
topTags(qlf)
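For reference, the RvsS.48 contrast is a difference of differences: the 48 h response in resistant lines minus the 48 h response in susceptible lines. The sketch below spells out the coefficient weights this corresponds to using base R only. It assumes the column order printed for the design matrix above; the helper `unit` and the variable names are illustrative and not part of edgeR.

```r
# Column order as printed for the design matrix in the post.
coef_names <- c("R.0", "R.168", "R.48", "S.0", "S.168", "S.48")

# Hypothetical helper: indicator vector that picks out one coefficient.
unit <- function(name) as.numeric(coef_names == name)

# 48 h response within each IC class (time 48 minus time 0).
resp_R <- unit("R.48") - unit("R.0")
resp_S <- unit("S.48") - unit("S.0")

# Interaction: does the 48 h response differ between R and S?
RvsS.48 <- setNames(resp_R - resp_S, coef_names)
print(RvsS.48)
```

A numeric vector of this form should be usable as the `contrast` argument of `glmQLFTest` in place of the `makeContrasts` column, provided the design columns are in the same order.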
What do you think? Shouldn't I account for the fact that I have different cell lines, as a sort of batch effect?
Thanks a lot
Pietro
PS: cross posted to https://support.bioconductor.org/p/119310/
As you posted on Bioconductor, Aaron or Gordon will likely respond, and their answer will supersede any answer given here. On face value, yes, you should be adjusting for cell line by, for example, including cellLine in your design formula. So, something like:

Before doing this, I would also check via a PCA bi-plot to see how cellLine is distributed across your samples.

Thanks Kevin,
Tried that, but it throws a "Design matrix not of full rank" error.
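The rank problem can be reproduced with base R alone. The exact formula that was tried is not shown in the thread, so the sketch below assumes an additive formula containing both cellLine and IC (`~ cellLine + IC + time`); the data frame is rebuilt from the sample table in the question.

```r
# Sample table from the question, rebuilt by hand.
samples_table <- data.frame(
  sampleId  = paste0("s", 1:18),
  cellLine  = factor(rep(c("a", "b", "c", "d", "e", "f"), each = 3)),
  treatment = rep(c("vehicle", "drug", "drug"), times = 6),
  time      = factor(rep(c(0, 48, 168), times = 6)),
  IC        = factor(rep(c("S", "R"), times = c(12, 6)))
)

# Assumed formula: cell line and IC status together (not confirmed by the thread).
design <- model.matrix(~ cellLine + IC + time, data = samples_table)

# A design is full rank only if its rank equals its number of columns.
design_rank <- qr(design)$rank
cat("columns:", ncol(design), "rank:", design_rank, "\n")
```

The rank comes out one short of the column count because the IC column can be written as a combination of the intercept and the cellLine columns.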

You can't include both cellLine and IC, they're mutually exclusive: each cell line is either S or R, so IC is fully determined by cellLine.

Hi Devon,
Realized that. So, what do you suggest to account for cellLine in my design? Thanks

Remove IC and perform any relevant grouping of it as needed during a contrast.

You mean
?
And then I have
How do I specify that I want time48 only for cell lines e and f against time48 for the other 4 cell lines? Thanks

If you want specific group comparisons like that, it's more convenient to make groups like a_time48, a_time168 and use ~0 + group as a model.
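
A minimal base-R sketch of that grouping, assuming the sample layout from the question and assuming the comparison of interest is the average 48 h versus 0 h change in lines e and f against the same change in lines a to d. The group labels and the hand-built contrast vector `con` are illustrative only.

```r
# Sample layout from the question: 6 cell lines x 3 time points.
cellLine <- rep(c("a", "b", "c", "d", "e", "f"), each = 3)
time     <- rep(c(0, 48, 168), times = 6)

# One group per cell line and time point, e.g. "a_time48".
group  <- factor(paste0(cellLine, "_time", time))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

resistant   <- c("e", "f")
susceptible <- c("a", "b", "c", "d")

# Average (48 h - 0 h) in resistant lines minus the same average in susceptible lines.
con <- setNames(numeric(ncol(design)), colnames(design))
con[paste0(resistant,   "_time48")] <-  1 / length(resistant)
con[paste0(resistant,   "_time0")]  <- -1 / length(resistant)
con[paste0(susceptible, "_time48")] <- -1 / length(susceptible)
con[paste0(susceptible, "_time0")]  <-  1 / length(susceptible)

# With one library per cell line and time point, this design has as many
# columns as samples, so no residual degrees of freedom are left.
cat("samples:", nrow(design), "coefficients:", ncol(design), "\n")
```

Because the 18 samples give 18 distinct groups here, this layout leaves nothing to estimate dispersion from, so it is only workable if replicates exist within each cell line and time point.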