4.9 years ago by
Overall, I would expect a roughly normal distribution within each sample if you work with log2(RPKM + 0.1) values, apart from a peak at the rounding cutoff (log2(0.1), which is where genes with an RPKM of 0 end up). If you wanted, you could remove that peak by filtering out the genes that almost never vary from the cutoff across the samples.
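As a rough illustration of the transform and filter described above, here is a minimal Python sketch. The function names, the toy RPKM matrix, and the `max_floor_fraction` threshold are all made up for the example (they are not from any particular package), and what counts as "almost never varied" is a judgment call you would need to tune for your own data.

```python
import numpy as np

def log2_rpkm(rpkm, offset=0.1):
    """Return log2(RPKM + offset); genes with RPKM 0 land on log2(offset)."""
    return np.log2(np.asarray(rpkm, dtype=float) + offset)

def drop_floor_genes(log_expr, offset=0.1, max_floor_fraction=0.9, tol=1e-9):
    """Keep genes (rows) that sit at the floor in at most
    max_floor_fraction of the samples (columns).

    max_floor_fraction is an assumed, illustrative threshold.
    """
    floor = np.log2(offset)
    at_floor = np.abs(log_expr - floor) <= tol
    keep = at_floor.mean(axis=1) <= max_floor_fraction
    return log_expr[keep], keep

# Toy matrix: 4 genes (rows) x 4 samples (columns), values invented.
rpkm = np.array([
    [0.0, 0.0, 0.0, 0.0],   # never expressed
    [5.2, 8.1, 3.3, 12.0],  # expressed in every sample
    [0.0, 0.0, 0.0, 2.4],   # at the floor in 3 of 4 samples
    [0.0, 1.5, 0.0, 0.7],   # at the floor in 2 of 4 samples
])

log_expr = log2_rpkm(rpkm)
filtered, keep = drop_floor_genes(log_expr, max_floor_fraction=0.5)
print(keep)            # boolean mask of the genes that were kept
print(filtered.shape)  # (kept genes, samples)
```

With the threshold set to 0.5, the first and third genes are dropped because they sit at the floor in more than half of the samples. Plotting a histogram of each column of `filtered` is then a quick way to check whether the peak at the cutoff is gone.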
For a gene-centric distribution, I agree with the other comments: it will vary between genes, and I wouldn't be surprised if it also depended on the context of the experiment (for example, on how heterogeneous the samples are).
This may be a bit of a tangent, but I've experimented with modeling bimodal gene expression, and I've described my experiences here:
That blog post was influenced by the work I did for this project: