I'm using edgeR to identify differentially expressed genes between control and treatment groups of mice. Here, vd_s is control and vd_r is treatment, sampled at 1 day (vd_s1 and vd_r1) and 7 days (vd_s7 and vd_r7) after birth, with two replicates per group. I used the following edgeR code:
    library(edgeR)

    # Read the count table; the first column holds the gene IDs
    count <- read.delim("vd_count.txt", row.names = 1)
    group <- factor(c(rep("vd_s1", 2), rep("vd_r1", 2), rep("vd_r7", 2), rep("vd_s7", 2)))
    y <- DGEList(counts = count, group = group)

    # Keep genes with CPM > 0.5 in at least two samples
    keep <- rowSums(cpm(y) > 0.5) >= 2
    y <- y[keep, , keep.lib.sizes = FALSE]

    y <- calcNormFactors(y)
    plotMDS(y, col = as.numeric(y$samples$group))
Here is the MDS plot. I expected the samples to form two groups, treatment and control, but as you can see they separate by time point instead: vd_s1 and vd_r1 cluster apart from vd_s7 and vd_r7. Could you please let me know whether this is a sign of a batch effect? If so, could you kindly advise how I can account for it in R and remove it?
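To make the question concrete, here is a minimal sketch of how the time point might be encoded as its own factor alongside treatment, rather than as a batch. It uses only base R. The names treatment, time and design are hypothetical, and it assumes the columns of vd_count.txt are in the same order as the group vector above. I am not sure this is the right model for these data.

```r
# Sample layout copied from the group vector above
# (assumption: it matches the column order of the count table).
group <- factor(c(rep("vd_s1", 2), rep("vd_r1", 2), rep("vd_r7", 2), rep("vd_s7", 2)))

# Split each label into treatment (s = control, r = treatment)
# and time (1 or 7 days after birth).
treatment <- factor(substr(as.character(group), 4, 4), levels = c("s", "r"))
time <- factor(sub("^vd_[sr]", "", as.character(group)), levels = c("1", "7"))

# Additive design: the treatment effect adjusted for the day-1 vs day-7 difference.
design <- model.matrix(~ time + treatment)
print(design)
```

If this is a sensible approach, I assume a matrix like this could be passed to edgeR's estimateDisp() and glmQLFit() in place of the single group factor, but I would appreciate confirmation.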
Thanks in advance