How to pick simulation parameters for the RNASeqReadSimulator
6.7 years ago
iraun ★ 3.8k

Hi there,

I am using the tool RNASeqReadSimulator (http://alumni.cs.ucr.edu/~liw/rnaseqreadsimulator.html) to simulate RNA-seq reads. In the first script (there are 3), it is possible to specify expression values for the genes.

python genexplvprofile.py -h
-e/--lognormal  mu,sigma        Specify the mean and variance of the lognormal distribution used to assign expression levels.  Default -4,4

I'm not very good at statistics, and I would like to know which values of mu and sigma would be appropriate if I want all the genes to be expressed in my simulated data.

Any ideas or suggestions?

Thanks


When in doubt, the defaults are a good start.


Yes, sure. But I was thinking that if I set a HIGH mean value and a LOW variance, maybe I can simulate all the genes being expressed? But I don't know which number counts as "high" or "low", and maybe it depends on the number of genes...

6.7 years ago

Since the help text says that gene expression levels are drawn from a lognormal distribution, the best way to evaluate these parameters is to look at the shape of the curve for various parameter values:

http://en.wikipedia.org/wiki/Log-normal_distribution

In those plots, imagine that the horizontal axis corresponds to the gene expression level and the vertical axis to the fraction of genes expressed at that level. You can use an online calculator to display what the distributions would look like:

http://distributome.org/V3/calc/LogNormalCalculator.html
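
If you would rather compare parameter pairs numerically than by eye, here is a minimal sketch using only the Python standard library. It rests on assumptions I have not checked against the tool's source: that mu and sigma are the mean and standard deviation of the natural log of the expression level (the help text says "variance", so genexplvprofile.py may interpret sigma differently), and that some cutoff separates "effectively unexpressed" from "expressed". The cutoff of 1e-3 and the function name lognormal_summary are purely illustrative. Note that a lognormal value is always greater than zero, so every gene gets a nonzero level whatever you choose; the practical question is how many genes end up so low that they yield almost no reads.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma, threshold):
    """Summarise a lognormal whose natural log is Normal(mu, sigma).

    Assumption: sigma is the standard deviation of the log values.
    If the tool treats it as a variance, pass math.sqrt(value) instead.
    """
    log_dist = NormalDist(mu, sigma)
    return {
        # exp() of a quantile of the log values is a quantile of the levels
        "median": math.exp(mu),
        "p05": math.exp(log_dist.inv_cdf(0.05)),
        "p95": math.exp(log_dist.inv_cdf(0.95)),
        # fraction of genes whose level falls below the cutoff
        "frac_below": log_dist.cdf(math.log(threshold)),
    }


THRESHOLD = 1e-3  # hypothetical cutoff, not something the tool defines

for mu, sigma in [(-4, 4), (-4, 1), (0, 1), (2, 0.5)]:
    s = lognormal_summary(mu, sigma, THRESHOLD)
    print(
        f"mu={mu:>4} sigma={sigma:>4}  median={s['median']:.3g}  "
        f"5%-95%=[{s['p05']:.3g}, {s['p95']:.3g}]  "
        f"below cutoff={s['frac_below']:.1%}"
    )
```

Raising mu shifts the whole distribution towards higher levels, and lowering sigma narrows it, so fewer genes land in the very low tail. The number of genes does not change these fractions, only how many genes each fraction corresponds to.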