Recently I'm working on an R pipeline for RNA-seq analysis. My goal is to create a script that compares my samples to each other and compute the DE.
So far I created my script and its working well. But when I implemented loops which would make my life easier, I noticed that the result are different. To be precise: First I did the comparisons sample pair by pair, than with the loops I loaded in all my data and made the comparison sample pair by pair.
What I noticed is everything goes as it should be, than when I reach the part where I call the
fit <- glmQLFit(dge,dge$design,robust = TRUE)
function, the a little difference shows up:
The first pic shows when ALL the samples loaded in, the second is when only 2 group (6 samples) is loaded in. After that when I want to make a
res <- glmQLFTest(fit,contrast = c(-1, 0,1,0))
it obviously shows different result. Edger uses different calculations based on the loaded samples? Also it makes the DE totally different when I try to visualize it. For the 2 example:
One plot shows a lot of DE genes (this is where only 2 groups (6 samples) were compared), and the other only shows 2 (this is where all the samples processed together).
My code looks the same even if I use 2 group samples or multiple group samples.
What could be the problem? Thank you for your time and help!