I tried to validate whether the batch effects in my RNA-seq experiment are real. You can find more info regarding the experiment in this post.
In short: because the RNA and the libraries were prepared at different time points, I assumed that I have 3 potential batch effects in total.
I used svaseq from the sva package to check this. During the procedure I was wondering about some details.
Using the function num.sv led to 10 estimated surrogate variables. Interestingly, svaseq then failed with the following error:
Error in density.default(x, adjust = adj) : 'x' contains missing values
Using 9 variables worked fine. Does anyone have an idea what is going on? I checked my matrix and it does not contain any NAs.
Edit (20160705): It seems that the function num.sv() from sva has a bug. I tried it on several experiments and svaseq() always produced the same error. If I set the n.sv argument of svaseq() to the value returned by num.sv() minus 1, it always works fine.
If one does not use num.sv and therefore does not supply a number to the n.sv argument of svaseq, svaseq estimates the number of surrogate variables on its own. Interestingly, this estimate differs considerably from the one returned by num.sv: in my case I got 6 instead of 10 surrogate variables, which confused me a bit. Can someone explain this difference?
Fitting one model with my own assumed batch effects and another with the 9 surrogate variables (9 because 10 did not work, as described above) led to the same top hit. The two result lists differ, but the top hit is the same and there are many other overlaps. Naively, I would say this is a good hint that svaseq captured my batch effects properly and that they are real. After all, my batch effects are factors and the SVs from svaseq are continuous. It seems unlikely that both models would give me the same top hit by chance, doesn't it? What do you think?
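To put a rough number on such an overlap, here is a minimal sketch in plain Python (not R and not part of sva). It counts the genes shared by the top n of two ranked result lists and computes how likely that many shared genes would be if the two top-n lists had been drawn independently at random (a hypergeometric tail). The gene IDs, the list contents and the total of 20000 genes are made up for illustration. Note that the independence assumption is optimistic here: both models are fitted to the same counts, so some agreement is expected regardless of how well the batch effects were captured.

```python
from math import comb


def top_n_overlap(ranked_a, ranked_b, n):
    """Return the genes that appear in the top n of both ranked lists."""
    return set(ranked_a[:n]) & set(ranked_b[:n])


def chance_overlap_pvalue(total_genes, n, k):
    """P(at least k shared genes) if two top-n lists were drawn
    independently at random from total_genes genes
    (upper tail of the hypergeometric distribution)."""
    denom = comb(total_genes, n)
    tail = sum(
        comb(n, i) * comb(total_genes - n, n - i)
        for i in range(k, n + 1)
    )
    return tail / denom


# Hypothetical ranked hit lists (best hit first), for illustration only.
batch_model_hits = ["g7", "g2", "g9", "g4", "g1", "g8"]
sv_model_hits = ["g7", "g9", "g3", "g2", "g5", "g1"]

shared = top_n_overlap(batch_model_hits, sv_model_hits, n=4)
p_chance = chance_overlap_pvalue(total_genes=20000, n=4, k=len(shared))
print(sorted(shared), p_chance)
```

With n = 1 and k = 1 the function reduces to 1 / total_genes, which is the chance of two independent random rankings sharing the same top hit.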
I also tried to associate my batch effects with the surrogate variables estimated by svaseq. Unfortunately, the power is too low to get reliable results. I checked some plots (box plots, strip charts), and there the association between the batch effects and the SVs is visible but not fully convincing.
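One possible way to quantify the association between a categorical batch factor and a continuous surrogate variable with few samples is a permutation test on eta squared, the share of the SV's variance explained by the batch labels. The sketch below is plain Python and only an illustration under assumptions: the SV values and batch labels are invented, and the function names are hypothetical rather than taken from sva.

```python
import random
from statistics import mean


def eta_squared(values, groups):
    """Share of the variance in `values` explained by `groups` (0 to 1)."""
    grand = mean(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss_between = sum(
        len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values()
    )
    return ss_between / ss_total


def permutation_pvalue(values, groups, n_perm=2000, seed=1):
    """Fraction of label shuffles whose eta squared reaches the observed one."""
    rng = random.Random(seed)
    observed = eta_squared(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if eta_squared(values, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented example: one surrogate variable across 9 samples in 3 batches.
sv1 = [-0.41, -0.35, -0.28, 0.02, 0.05, 0.11, 0.27, 0.30, 0.29]
batch = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

print(eta_squared(sv1, batch), permutation_pvalue(sv1, batch))
```

The permutation approach makes no distributional assumption, which can help with small sample sizes, but it cannot create power that the design does not have. With several surrogate variables and several batch factors, the resulting p-values would also need a multiple-testing correction.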