Question: Experimental design RNAseq differential expression
0
gravatar for Pietro
15 months ago by
Pietro100
Italy
Pietro100 wrote:

Dear users,

I am struggling to understand if my design is correct. I found edgeR section 3.3.1 similar to my situation but I am not that confident.

Here is my experimental design:

samples_table

sampleId    cellLine    treatment   time        IC
s1                 a      vehicle      0         S
s2                 a         drug     48         S
s3                 a         drug    168         S
s4                 b      vehicle      0         S
s5                 b         drug     48         S
s6                 b         drug    168         S
s7                 c      vehicle      0         S
s8                 c         drug     48         S
s9                 c         drug    168         S
s10                d      vehicle      0         S
s11                d         drug     48         S
s12                d         drug    168         S
s13                e      vehicle      0         R
s14                e         drug     48         R
s15                e         drug    168         R
s16                f      vehicle      0         R
s17                f         drug     48         R
s18                f         drug    168         R

I have 6 cell lines treated with a drug and RNA sequenced after 48 and 168 hours of treatment. Last column indicates if the cell line is susceptible or resistant to another compound.

I would like to find how resistant cell lines differentially respond to the drug at 48 and/or at 168 hours compared to resistant ones.

Here is my approach:

group <- factor(paste(samples_table$IC, samples_table$time, sep="."))
y <- DGEList(counts.keep, group=group)
y <- calcNormFactors(y)
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)

# Quasi-likelihood test
fit <- glmQLFit(y, design)

design

   R.0 R.168 R.48 S.0 S.168 S.48
1    0     0    0   1     0    0
2    0     0    0   0     0    1
3    0     0    0   0     1    0
4    0     0    0   1     0    0
5    0     0    0   0     0    1
6    0     0    0   0     1    0
7    0     0    0   1     0    0
8    0     0    0   0     0    1
9    0     0    0   0     1    0
10   1     0    0   0     0    0
11   0     0    1   0     0    0
12   0     1    0   0     0    0
13   1     0    0   0     0    0
14   0     0    1   0     0    0
15   0     1    0   0     0    0
attr(,"assign")
[1] 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"

my.contrasts <- makeContrasts(
  S48 = S.48 - S.0,
  S168 = S.168 - S.0,
  S168vs48 = S.168 - S.48,
  R48 = R.48 - R.0,
  R168 = R.168 - R.0,
  R168vs48 = R.168 - R.48,
  RvsS.0 = R.0 - S.0,
  RvsS.48 = (R.48 - R.0) - (S.48 - S.0),
  RvsS.168 = (R.168 - R.0) - (S.168 - S.0),
  all = (R.48 + R.168 - R.0) - (S.48 + S.168 - S.0),
  levels = design
)

# to find genes that differentially respond at 48h between resistant and susceptible cell lines

qlf <- glmQLFTest(fit, 
                  contrast = my.contrasts[ , "RvsS.48"])

topTags(qlf)

What do you think? Shouldn't I account the fact that I have different cell lines as a sort of batch effect?

Thanks a lot

Pietro

PS: cross posted to https://support.bioconductor.org/p/119310/

ADD COMMENTlink modified 15 months ago by Biostar ♦♦ 20 • written 15 months ago by Pietro100

As you posted on Bioconductor, Aaron or Gordon will likely respond, and their answer will supersede any answer given here. On face value —yes— you should be adjusting for cell-line by, for example, including cellLine in your design formula. So, something like:

~ cellLine + group

Before doing this, I would also just check via a PCA bi-plot to see how cellLine is distributed across your samples

ADD REPLYlink modified 15 months ago • written 15 months ago by Kevin Blighe61k

Thanks Kevin,

Tried that, but it throws an "Design matrix not of full rank" error.

ADD REPLYlink written 15 months ago by Pietro100

You can't include both cellLine and IC, they're mutually exclusive.

ADD REPLYlink written 15 months ago by Devon Ryan95k

Hi Devon,

Realized that. So, what do you suggest to account for cellLine in my design?

Thanks

ADD REPLYlink modified 15 months ago • written 15 months ago by Pietro100

Remove IC and perform any relevant grouping of it as needed during a contrast.

ADD REPLYlink written 15 months ago by Devon Ryan95k

You mean

design <- model.matrix(~ 0 + cellLine + time, 
                       levels = samples_table)

?

And then I have

     a   b    c    d   e     f time48 time168
1    1   0    0    0   0     0      0       0
2    1   0    0    0   0     0      1       0
3    1   0    0    0   0     0      0       1
4    0   1    0    0   0     0      0       0
5    0   1    0    0   0     0      1       0
6    0   1    0    0   0     0      0       1
7    0   0    1    0   0     0      0       0
8    0   0    1    0   0     0      1       0
9    0   0    1    0   0     0      0       1
10   0   0    0    1   0     0      0       0
11   0   0    0    1   0     0      1       0
12   0   0    0    1   0     0      0       1
13   0   0    0    0   1     0      0       0
14   0   0    0    0   1     0      1       0
15   0   0    0    0   1     0      0       1
16   0   0    0    0   0     1      0       0
17   0   0    0    0   0     1      1       0
18   0   0    0    0   0     1      0       1

How do I specify that I want time48 only for cell lines e and f against time48 for the other 4 cell lines? Thanks

ADD REPLYlink modified 15 months ago • written 15 months ago by Pietro100

If you want specific group comparisons like that it's more convenient to make groups like a_time48, a_time168 and use ~0 + group as a model.

ADD REPLYlink written 15 months ago by Devon Ryan95k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1590 users visited in the last hour