Question: Differential analysis with replicates using edgeR
Vasu wrote (3 months ago):

Hi,

I have 8 RNA-Seq samples: 4 controls and 4 treated. I would like to run a differential expression analysis with edgeR. The column data is as follows.

Samples Type
Sample1 Control
Sample2 Control
Sample5 Control
Sample6 Control
Sample7 Treatment
Sample8 Treatment
Sample3 Treatment
Sample4 Treatment

In the table above, Sample1 and Sample2 [Controls] and Sample3 and Sample4 [Treatment] were processed on one day, while Sample5 and Sample6 [Controls] and Sample7 and Sample8 [Treatment] were processed on another day.

As you can see, the replicates were not all processed together, so there may be a batch effect. Given this, how can I create the design matrix in edgeR for the differential analysis?

b.nota (Netherlands) wrote (3 months ago):

Please read the edgeR user's guide; it is very clearly written, also for beginners.


Could you please tell me which section I should check for this?

written 3 months ago by Vasu

The section about batch effects. But reading from the start is also wise...

modified 3 months ago • written 3 months ago by b.nota

I had a look into it. This is the first time I'm working with such data. Could you please tell me whether this is right or not.

coldata

Samples  Type       Replicates
Sample1  Control    rep1
Sample2  Control    rep1
Sample5  Control    rep2
Sample6  Control    rep2
Sample7  Treatment  rep2
Sample8  Treatment  rep2
Sample3  Treatment  rep1
Sample4  Treatment  rep1

group <- factor(paste0(coldata$Type))

I created the design matrix as follows:

design <- model.matrix(~ 0 + group + coldata$Replicates)
colnames(design) <- c("Control","Treatment","Repl")

And the design looks like below:

  Control Treatment Repl
1       1        0    0
2       1        0    0
3       0        1    0
4       0        1    0
5       1        0    1
6       1        0    1
7       0        1    1
8       0        1    1
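
For readers who want to reproduce the matrix above, here is a minimal base-R sketch. It assumes the sample table is ordered Sample1 to Sample8 (which is what the printed design implies, rather than the row order shown in coldata), and the `batch` variable name is introduced here only for illustration.

```r
# Hypothetical reconstruction of the sample table from the post,
# ordered Sample1..Sample8 so that rows match the printed design.
coldata <- data.frame(
  Samples    = paste0("Sample", 1:8),
  Type       = c("Control", "Control", "Treatment", "Treatment",
                 "Control", "Control", "Treatment", "Treatment"),
  Replicates = rep(c("rep1", "rep2"), each = 4)
)

group <- factor(coldata$Type, levels = c("Control", "Treatment"))
batch <- factor(coldata$Replicates)  # processing day

# Additive model: one column per group plus one column for the second day.
design <- model.matrix(~ 0 + group + batch)
colnames(design) <- c("Control", "Treatment", "Repl")
rownames(design) <- coldata$Samples

# Both groups appear on both days, so the day effect can be
# separated from the treatment effect (design has full column rank).
print(table(group, batch))
stopifnot(qr(design)$rank == ncol(design))
print(design)
```

Whatever order the real coldata has, the rows of the design must be in the same order as the columns of the count matrix.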

Then I used the following commands for the model fit and differential expression analysis.

y <- estimateDisp(y, design, robust=TRUE)
fit <- glmQLFit(y, design, robust=TRUE)

contrast.matrix <- makeContrasts(Treatment-Control, levels=design)
contrast.matrix

Do you think this is right?

written 3 months ago by Vasu
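
The commands in the post above stop once the contrast has been built. As a rough sketch of how the remaining test step could look, the example below runs the same pipeline end to end and passes the contrast to glmQLFTest. The count matrix is simulated and purely an assumption standing in for the real data, as are the filtering and normalisation steps, which the post does not show.

```r
library(edgeR)

set.seed(1)
# Simulated counts (assumption): 1000 genes x 8 samples, no real signal.
counts <- matrix(rnbinom(1000 * 8, mu = 50, size = 10), ncol = 8,
                 dimnames = list(paste0("gene", 1:1000),
                                 paste0("Sample", 1:8)))
group <- factor(rep(c("Control", "Treatment", "Control", "Treatment"), each = 2))
batch <- factor(rep(c("rep1", "rep2"), each = 4))

design <- model.matrix(~ 0 + group + batch)
colnames(design) <- c("Control", "Treatment", "Repl")

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y, design)          # drop very low-count genes
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)                  # TMM normalisation factors

y <- estimateDisp(y, design, robust = TRUE)
fit <- glmQLFit(y, design, robust = TRUE)

# Treatment vs Control, adjusted for processing day.
contrast.matrix <- makeContrasts(Treatment - Control, levels = design)
qlf <- glmQLFTest(fit, contrast = contrast.matrix)
topTags(qlf)
```

Because the day term is in the design, the Treatment - Control comparison is made within each day and then combined.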

It looks alright, but one question. You call the batch replicate, does that mean they were from the same sample? Is it technical replication?

written 3 months ago by b.nota

Yes, they were from the same sample, but the RNA extraction was done on the next day.

As mentioned above, Sample1 and Sample2 [Controls] and Sample3 and Sample4 [Treatment] were done on one day, and Sample5 and Sample6 [Controls] and Sample7 and Sample8 [Treatment] were done on the other day.

All the samples are from the same cell-line.

modified 3 months ago • written 3 months ago by Vasu

How can you be sure that this 'batch' effect is going to bias your results? What evidence have you seen? In many experiments, samples are processed on separate days with minimal or no effect on the end results. If we did a time-course experiment, for example, and treated time as a batch effect, we would wipe out the very time-dependent differences that we wanted to find.

written 3 months ago by Kevin Blighe
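
One common way to look for the kind of evidence asked about here is an MDS plot labelled by group and processing day. The sketch below is only an illustration under assumptions: the counts are simulated (so no clustering by day is expected in its output), and the `batch` factor name is introduced here for the example.

```r
library(edgeR)

set.seed(1)
# Simulated counts (assumption) standing in for the real 8-sample matrix.
counts <- matrix(rnbinom(1000 * 8, mu = 50, size = 10), ncol = 8,
                 dimnames = list(paste0("gene", 1:1000),
                                 paste0("Sample", 1:8)))
group <- factor(rep(c("Control", "Treatment", "Control", "Treatment"), each = 2))
batch <- factor(rep(c("rep1", "rep2"), each = 4))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Label each point by group and day, colour by day. If samples separate
# by day rather than by group, a day term in the design is worth keeping.
mds <- plotMDS(y, labels = paste(group, batch, sep = "."),
               col = as.integer(batch))
```

With the real data, samples that cluster by day on the leading dimensions would support including the day in the model; samples that cluster only by group would suggest the day has little effect.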