Question: DE analysis with edgeR and possibly batch effect
seta wrote (7 days ago):

Hi all,

I'm using edgeR to identify differentially expressed genes between control and treatment groups of mice. Here, vd_s is control and vd_r is treatment, sampled at 1 day (vd_s1 and vd_r1) and 7 days (vd_s7 and vd_r7) after birth, with two replicates per group. I used the following edgeR code:

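library(edgeR)

# Read the count table, using the first column as row names
count <- read.delim("vd_count.txt", row.names = 1)
# Group labels must follow the column order of the count table
group <- factor(c(rep("vd_s1", 2), rep("vd_r1", 2), rep("vd_r7", 2), rep("vd_s7", 2)))
y <- DGEList(counts = count, group = group)
# Keep genes with CPM > 0.5 in at least two samples
keep <- rowSums(cpm(y) > 0.5) >= 2
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
plotMDS(y, col = as.numeric(y$samples$group))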

Here is the MDS plot (MDS plot). I expected the samples to form two groups, treatment and control, but as you can see they separate by time point instead: vd_s1 and vd_r1 cluster apart from vd_s7 and vd_r7. Could you please let me know whether this is a sign of a batch effect? If so, could you kindly advise me how to specify it in R and remove it?

Thanks in advance

modified 7 days ago by Devon Ryan • written 7 days ago by seta

I do not know. Should there be a batch effect, based on how you performed your laboratory work? Were you not expecting time to be the main source of variation among the samples? We are talking about the period just after birth, as the newborn adjusts to the real world. What does a PCA bi-plot show?

Even if you adjust for what you perceive as a batch effect, you will then 'drown out' (eliminate) most or all of the effect of time.

modified 7 days ago • written 7 days ago by Kevin Blighe

We don't know which treatment you applied or which effects you expect from it. I am not surprised, though, to see a bigger difference between 1-day-old and 7-day-old mice.

written 7 days ago by WouterDeCoster
Devon Ryan (Freiburg, Germany) wrote (7 days ago):

I see no evidence of a batch effect in your data. That your results differ by day seems biologically expected, so day should be added to your model.

written 7 days ago by Devon Ryan

Thank you very much for all the comments. Devon, do you mean using something like lrt <- glmLRT(fit, coef=2)?

written 7 days ago by seta

You would need a factorial design: ~day + treatment or similar.

written 7 days ago by Devon Ryan
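
To make the suggested design concrete, here is a minimal sketch of what ~day + treatment could look like for these eight samples. It assumes the count table columns are in the same order as the group vector in the question (vd_s1, vd_s1, vd_r1, vd_r1, vd_r7, vd_r7, vd_s7, vd_s7). The variable names day and treatment and the level labels "s" and "r" are illustrative choices, and the counts are simulated placeholders standing in for vd_count.txt, so the resulting table means nothing biologically.

```r
# Sample annotation, assumed to match the column order used in the question:
# vd_s1, vd_s1, vd_r1, vd_r1, vd_r7, vd_r7, vd_s7, vd_s7
day       <- factor(c(1, 1, 1, 1, 7, 7, 7, 7))
treatment <- factor(c("s", "s", "r", "r", "r", "r", "s", "s"),
                    levels = c("s", "r"))  # "s" (control) is the reference level

# Additive design: intercept, day effect, treatment effect
design <- model.matrix(~ day + treatment)
print(colnames(design))

# The edgeR part runs only if the package is installed
if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated placeholder counts (1000 genes x 8 samples), NOT real data
  counts <- matrix(rnbinom(1000 * 8, mu = 50, size = 10), ncol = 8)
  y <- DGEList(counts = counts)
  y <- calcNormFactors(y)
  y <- estimateDisp(y, design)
  fit <- glmFit(y, design)
  # Test the treatment coefficient, adjusting for day
  lrt <- glmLRT(fit, coef = "treatmentr")
  print(topTags(lrt))
}
```

With this column layout, the second coefficient is the day effect and the third is the treatment effect, so coef=2 would test day 7 against day 1 rather than treatment against control. If the treatment effect is expected to differ between the two time points, a design with an interaction term (~ day * treatment) is one possible reading of "or similar".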