I want to compare the performance of several gene set analysis methods, and since there is no gold standard, I want to simulate my own expression data. The simulated data should be a good approximation of real biological data, with its complex characteristics and distributions. Genes should be modeled as known correlated blocks, so that gene set analysis methods can then be run on them and their detection rates estimated against the known truth.
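To make the setup concrete, here is a minimal sketch in Python of the kind of simulation I have in mind. It is not Umpire and does not use its API. The function name `simulate_expression`, its parameters, and the default effect sizes are all hypothetical, and it assumes log-scale expression that is multivariate normal within each block, which is a strong simplification of real data.

```python
import numpy as np

def simulate_expression(n_samples_per_group=20, n_blocks=5, genes_per_block=10,
                        rho=0.6, effects=None, seed=0):
    """Simulate two-group expression data with block-correlated genes.

    Returns (expr, labels, truth): expr is a samples x genes array,
    labels marks the group (0 = control, 1 = case), and truth maps
    each block index to the mean shift applied to the case group.
    """
    rng = np.random.default_rng(seed)
    if effects is None:
        # Hypothetical ground truth: block 0 up, block 1 down, rest unchanged.
        effects = {0: 1.0, 1: -1.0}

    n_samples = 2 * n_samples_per_group
    labels = np.repeat([0, 1], n_samples_per_group)

    # Within-block covariance: unit variance, constant correlation rho.
    block_cov = np.full((genes_per_block, genes_per_block), rho)
    np.fill_diagonal(block_cov, 1.0)

    expr = np.empty((n_samples, n_blocks * genes_per_block))
    truth = {}
    for b in range(n_blocks):
        cols = slice(b * genes_per_block, (b + 1) * genes_per_block)
        # Blocks are drawn independently, so genes correlate only within a block.
        block = rng.multivariate_normal(np.zeros(genes_per_block), block_cov,
                                        size=n_samples)
        shift = effects.get(b, 0.0)
        block[labels == 1] += shift  # known up/down regulation of the gene set
        expr[:, cols] = block
        truth[b] = shift
    return expr, labels, truth

expr, labels, truth = simulate_expression()
```

Here `truth` is the annotation of which gene sets are up or down regulated, so the output of each method can be scored against it.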
I found the Umpire R package, which looks promising, but an annotation of which gene sets are up- and down-regulated seems to be missing. Does anybody have experience working with Umpire, know of a different tool for this purpose, or know of a paper that describes a workflow for simulating expression data?
With best regards,