Question: Differential analysis with replicates using edgeR
Vasu370 wrote:

Hi,

I have 8 RNA-Seq samples: 4 are controls and the other 4 are treated. I'm interested in doing a differential expression analysis with edgeR. The column data is as follows.

Samples Type
Sample1 Control
Sample2 Control
Sample5 Control
Sample6 Control
Sample7 Treatment
Sample8 Treatment
Sample3 Treatment
Sample4 Treatment

In the table above, Sample1 and Sample2 [Control] and Sample3 and Sample4 [Treatment] were processed on one day, while Sample5 and Sample6 [Control] and Sample7 and Sample8 [Treatment] were processed on another day.

As you can see, the replicates were not all processed together, so there is a batch effect. Given this, how can I create the design matrix in edgeR for the differential analysis?

modified 8 months ago by Biostar ♦♦ 20 • written 9 months ago by Vasu370
Benn7.7k wrote:

Please read the edgeR manual; it is very clearly written, also for beginners.

written 9 months ago by Benn7.7k

Could you please tell me which section I should check for this?

written 9 months ago by Vasu370

The section about batch effects. But reading from the start is also wise...

modified 9 months ago • written 9 months ago by Benn7.7k

I had a look at it. This is the first time I'm working with such data. Could you please tell me whether this is right or not?

coldata

Samples Type      Replicates
Sample1 Control   rep1
Sample2 Control   rep1
Sample5 Control   rep2
Sample6 Control   rep2
Sample7 Treatment rep2
Sample8 Treatment rep2
Sample3 Treatment rep1
Sample4 Treatment rep1

group <- factor(coldata$Type)

I created the design matrix as follows:

design <- model.matrix(~ 0 + group + coldata$Replicates)
colnames(design) <- c("Control","Treatment","Repl")

And the design looks like below:

  Control Treatment Repl
1       1        0    0
2       1        0    0
3       0        1    0
4       0        1    0
5       1        0    1
6       1        0    1
7       0        1    1
8       0        1    1
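A minimal base-R sketch of how such a design matrix can be built from the sample sheet, assuming `coldata` is a data frame reconstructed from the table in this post (the `batch` variable name is introduced here for illustration). Note that the rows of the design follow the row order of `coldata`, which is assumed to match the column order of the count matrix; the 0/1 pattern printed above corresponds to samples ordered Sample1 to Sample8 rather than to the order of the table.

```r
# Sample sheet reconstructed from the table in the post (assumed to be in
# the same order as the columns of the count matrix).
coldata <- data.frame(
  Samples    = paste0("Sample", c(1, 2, 5, 6, 7, 8, 3, 4)),
  Type       = rep(c("Control", "Treatment"), each = 4),
  Replicates = c("rep1", "rep1", "rep2", "rep2", "rep2", "rep2", "rep1", "rep1")
)

group <- factor(coldata$Type)
batch <- factor(coldata$Replicates)  # processing day, treated as a batch factor

# No-intercept design: one column per group, plus one column for rep2 vs rep1.
design <- model.matrix(~ 0 + group + batch)
colnames(design) <- c("Control", "Treatment", "Repl")
rownames(design) <- coldata$Samples
design
```

Using named factors instead of `coldata$Replicates` inside the formula only changes the default column names; the resulting matrix is the same.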

Then I used the following commands for the linear model fit and the differential expression analysis.

y <- estimateDisp(y, design, robust=TRUE)
fit <- glmQLFit(y, design, robust=TRUE)

contrast.matrix <- makeContrasts(Treatment-Control, levels=design)
contrast.matrix

Do you think this is right?
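For context, the commands above stop at the contrast; a hedged end-to-end sketch of how the contrast is usually passed to a test is given here. The count matrix is simulated and purely illustrative (it is not the poster's data), samples are assumed to be ordered Sample1 to Sample8, and the filtering and normalization steps are assumptions about a typical edgeR quasi-likelihood workflow rather than something stated in the thread.

```r
library(edgeR)

# Simulated counts, for illustration only: 1000 genes x 8 samples.
set.seed(1)
counts <- matrix(rnbinom(1000 * 8, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("Sample", 1:8)))

# Sample annotation in the same order as the count columns:
# Sample1-4 processed on day 1 (rep1), Sample5-8 on day 2 (rep2).
group <- factor(rep(c("Control", "Control", "Treatment", "Treatment"), 2))
batch <- factor(rep(c("rep1", "rep2"), each = 4))

design <- model.matrix(~ 0 + group + batch)
colnames(design) <- c("Control", "Treatment", "Repl")

# Build the DGEList, drop lowly expressed genes, normalize library sizes.
y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Dispersion estimation and quasi-likelihood fit, as in the post.
y <- estimateDisp(y, design, robust = TRUE)
fit <- glmQLFit(y, design, robust = TRUE)

# Treatment vs Control, adjusted for the batch column.
contrast.matrix <- makeContrasts(Treatment - Control, levels = design)
res <- glmQLFTest(fit, contrast = contrast.matrix)
topTags(res)
```

Because the batch column is part of the design, the Treatment - Control contrast compares the groups after accounting for the day-to-day difference.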

written 9 months ago by Vasu370

It looks all right, but one question: you call the batch "replicate". Does that mean they were from the same sample? Is it technical replication?

written 9 months ago by Benn7.7k

Yes, they were the same sample, but the RNA extraction was done on the next day.

As mentioned above, Sample1 and Sample2 [Control] and Sample3 and Sample4 [Treatment] were processed on one day, while Sample5 and Sample6 [Control] and Sample7 and Sample8 [Treatment] were processed on another day.

All the samples are from the same cell-line.

modified 9 months ago • written 9 months ago by Vasu370

How can you be sure that this 'batch' effect is going to bias your results? What evidence have you seen? In many experiments, samples are processed on separate days with minimal or no effect on the end results. If we did a time-course experiment, for example, and assumed that time was a batch effect, then we would wipe out the very differences that we wanted to find based on time.
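One common way to look for such evidence is an MDS plot of the samples, colored by processing day. The sketch below assumes simulated counts (illustrative only, not the poster's data) and samples ordered Sample1 to Sample8; `plotMDS` is the function shipped with edgeR and limma.

```r
library(edgeR)

# Simulated counts, for illustration only: 1000 genes x 8 samples.
set.seed(2)
counts <- matrix(rnbinom(1000 * 8, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("Sample", 1:8)))

group <- factor(rep(c("Control", "Control", "Treatment", "Treatment"), 2))
batch <- factor(rep(c("rep1", "rep2"), each = 4))  # processing day

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Label points by group and color them by processing day.
plotMDS(y, labels = as.character(group), col = c("black", "red")[batch])
```

If the samples separate by color (day) rather than by label (group) along the leading dimension, that would suggest a day effect worth keeping in the design; if they do not, a simpler design without the batch term may be adequate.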

written 8 months ago by Kevin Blighe48k