Question: Checking potential batch effects with svaseq
gravatar for Tobias.Wohland
4.6 years ago by
Tobias.Wohland70 wrote:

I tried to validate whether the batch effects in my rna-seq experiment are real or not. You can find more info regarding the experiment in this post.

Shortly: Because of different time points when preparing the RNA and libraries I assumed that I have 3 potential batch effects in total.

I used svaseq from the sva-package to accomplish this. During the procedure I was wondering about some details.

  1. using the function led to 10 estimated surrogate variables. Interestingly, svaseq blow up with the following error: error in density.default(x, adjust = adj) : 'x' contains missing values Using 9 variables worked fine. Has someome an idea what is going on? I checked my matrix and it does not contain any NA's.

    Edit (20160705): It seems that the function from sva has a bug. I tried it on several experiments and svaseq() always spit out the same error. If I defined the within svaseq as, it always works perfectly fine.

  2. If one does not use and because of that does not supply a number to the of svaseq, it estimates the number of surrogate variables for its own. Interestingly this differs completely from the function because in my case I got 6 instead of 10 surrogate variables which confused me a bit. Can someone explain this difference?

Using my own estimated batch effects and the 9 surrogate variables estimated by (9 because 10 didn't work so far -> as described above) led to the same biggest hit. So both result lists are different but e.g. the top hit is the same and there are also a lot of other overlaps. Thinking very naive I would say this is a good hint, that svaseq found my batch-effects in a proper way and that they are real. I mean my batch effects are factors and the sv's from svaseq are continuous. It is unlikely that both models will give me the same top hit by chance, isn't it? What would you think?

I also tried to associate my batch effects with the estimated ones by svaseq. Unfortunately, the power is to low to get reliable results. I checked some plots (box plots, stripcharts) and here the association between the batch-effects and the SV's is visible but not fully convincing.

rna-seq R • 3.9k views
ADD COMMENTlink modified 23 months ago by bookerbn0 • written 4.6 years ago by Tobias.Wohland70

Hi Tobias.Wohland,

I am facing the same problem and I am not sure how to proceed. I tried your way,, but it did not work again. When I am calculating using "leek" method, the number of surrogate variables were 84. with the method "be", the number of surrogate variables were 1.

can you please help me get it working? Thank you Amol

ADD REPLYlink written 4.2 years ago by amoltej90

I suggests you add the code you used to reach the errors. Otherwise anyone trying to help will find very difficult to figure out what might be going wrong. It is possible that the function you are using has indeed a bug, but it is also possible that you are not using it right. Without the code it is impossible to tell.

ADD REPLYlink written 4.2 years ago by ddiez1.9k

Post your command and sessionInfo.

ADD REPLYlink written 4.2 years ago by GZ1995400
gravatar for bookerbn
23 months ago by
bookerbn0 wrote:

I encounter a similar error when working with sva and smartsva.cpp and I found that all my categorical variables read in as characters. Once I changed them all to factors the function worked without issues. I hope this helps! Cheers!

ADD COMMENTlink written 23 months ago by bookerbn0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1379 users visited in the last hour