I'm interested in simulating RNA-seq data using the rlsim tool. However, rather than randomly drawing the expression levels from a mixture of distributions, I wish to control which transcripts are expressed and at what relative levels. Basically, I have an input distribution of relative abundances, and I want to simulate an RNA-seq dataset where these relative abundances are the underlying (latent) model parameters. Is this possible in rlsim? I wasn't able to find a way to do this by looking at the user manual. If it is possible, how would I accomplish it?
rlsim cannot simulate from relative abundances (proportions) alone, but you can easily "convert" the proportions to absolute levels by multiplying them by a constant that is large enough for your purposes, and then construct the input Fasta using those levels. You might have to play around with the multiplier using the -m flag. Make sure that the "sampling ratio" reported by rlsim is small (let's say 10^-5 or smaller).
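The conversion could be sketched roughly as below. This is only an illustration, not part of rlsim: the function names, the example multiplier of 1,000,000, and the `name$level` header convention are assumptions, so check the rlsim manual for the exact header format your version expects.

```python
def proportions_to_levels(proportions, multiplier=1_000_000):
    """Scale relative abundances to integer expression levels.

    `multiplier` is an arbitrary example constant; pick one large
    enough that small proportions do not round down to zero.
    """
    total = sum(proportions.values())
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    # Normalise first so the input does not have to sum to exactly 1.
    return {name: round(p / total * multiplier)
            for name, p in proportions.items()}


def format_fasta(sequences, levels, sep="$"):
    """Build Fasta text with the level appended to each record name.

    ASSUMPTION: headers look like `>name$level`; verify this against
    the rlsim documentation before using the output.
    """
    records = []
    for name, seq in sequences.items():
        level = levels.get(name, 0)
        if level > 0:  # leave out transcripts that are not expressed
            records.append(f">{name}{sep}{level}\n{seq}\n")
    return "".join(records)


if __name__ == "__main__":
    # Hypothetical transcripts and proportions, for illustration only.
    seqs = {"tx1": "ACGTACGT", "tx2": "GGGCCC", "tx3": "TTTAAA"}
    props = {"tx1": 0.5, "tx2": 0.3, "tx3": 0.2}
    print(format_fasta(seqs, proportions_to_levels(props)), end="")
```

Rounding to integers means very rare transcripts can drop out if the multiplier is too small, which is one more reason to err on the large side.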