I am looking for suggestions on what I could investigate to find out what causes differences in R function outputs when running identical code on identical input data with identical software versions on two different machines. Specifically, I am looking for general factors that are not specific to the code of the particular function I ran.
I was running the R function `sctransform::vst`, which performs a variance-stabilizing transformation and then reports a per-gene residual variance. I ran it on two machines: the first a MacBook Pro running Mojave with the relevant R packages installed into a local user library, the second a Skylake node running an Ubuntu-based Singularity image, in which I installed the R packages via the renv lock file created from the MacBook user library. The R version is the same as well. As far as I can tell, the input data, software versions and code are identical.
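For concreteness, this is roughly how I confirmed on both machines that the restored library actually matches the lockfile (a minimal sketch; `renv::status()` assumes renv is active in the session):

```r
# Run in both sessions; a mismatch here would already explain output differences.
renv::status()                 # reports packages out of sync with renv.lock
packageVersion("sctransform")  # should print the same version on both machines
sessionInfo()                  # compare R version, platform, locale, loaded packages
```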
Still, I get outputs that differ in the decimal places, and these differences have an impact on downstream analyses that are based on this output.
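To quantify the divergence, I compared the saved residual-variance vectors from the two machines along these lines (the file names are hypothetical placeholders for `saveRDS()` output from each run):

```r
rv_mac   <- readRDS("residual_variance_mac.rds")    # hypothetical file from the MacBook run
rv_linux <- readRDS("residual_variance_linux.rds")  # hypothetical file from the Singularity run

identical(rv_mac, rv_linux)                         # bitwise equality (FALSE in my case)
all.equal(rv_mac, rv_linux)                         # equality within numeric tolerance
max(abs(rv_mac - rv_linux))                         # worst absolute difference
max(abs(rv_mac - rv_linux) /
    pmax(abs(rv_mac), .Machine$double.eps))         # rough worst relative difference
```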
Please throw buzzwords at me on what I could check and investigate to make the outputs 100% identical, particularly around rounding and the handling of decimals.
What I checked so far (a fingerprint sketch of these checks follows after the list):

- I use `set.seed()` before running the function
- `options()$digits` is 7 on both machines
- I set `options(scipen = 999)` on both machines
- I disabled implicit BLAS and OpenMP multithreading on the Linux node via the RhpcBLASctl package
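This is a sketch of the "numeric environment" fingerprint I now run on both machines to diff the factors above (the `BLAS`/`LAPACK` fields of `sessionInfo()` are available in R >= 3.4; the RhpcBLASctl calls are the ones I already use on the Linux node):

```r
# Print a fingerprint of numeric-relevant settings; diff the output across machines.
si <- sessionInfo()
cat("R:          ", si$R.version$version.string, "\n")
cat("BLAS:       ", si$BLAS, "\n")    # which BLAS library is actually linked
cat("LAPACK:     ", si$LAPACK, "\n")
cat("long double:", capabilities("long.double"), "\n")  # extended-precision support
cat("RNG:        ", paste(RNGkind(), collapse = " / "), "\n")

library(RhpcBLASctl)
blas_set_num_threads(1)               # force single-threaded BLAS
omp_set_num_threads(1)                # force single-threaded OpenMP
cat("BLAS threads:", blas_get_num_procs(), "\n")
cat("OMP threads: ", omp_get_max_threads(), "\n")
```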
...and please let's not discuss whether decimal differences are important or not, that is not the point here ;-)