I am interested in extracting the genes in a dataset whose expression has a high dynamic range.
This is specifically for using the Monocle package for single-cell RNA-seq analysis. The author suggests reducing the gene list to those genes that a) are detected in a sufficient number of cells, and b) vary over a sufficiently large dynamic range.
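To make criterion a) concrete, here is a minimal sketch of how I understand it, using NumPy rather than Monocle itself. The toy matrix, the `detection_threshold` value, and the `min_cells` value are all made-up assumptions for illustration, not values recommended by the package.

```python
import numpy as np

# Toy genes-by-cells expression matrix (made-up values; rows are genes).
expr = np.array([
    [0.0, 0.0, 0.0, 5.0],  # detected in 1 of 4 cells
    [2.0, 3.0, 0.0, 4.0],  # detected in 3 of 4 cells
    [1.0, 1.0, 1.0, 1.0],  # detected in 4 of 4 cells
])

# Assumed cutoffs, chosen only for this example.
detection_threshold = 0.1  # expression above this counts as "detected"
min_cells = 2              # a gene must be detected in at least this many cells

# Count, per gene, the cells in which expression exceeds the threshold.
cells_detected = (expr > detection_threshold).sum(axis=1)

# Boolean mask of genes that pass the detection filter.
keep = cells_detected >= min_cells
print(cells_detected, keep)
```

Both cutoffs would need to be tuned to the dataset; the point is only that the filter is a per-gene count across cells.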
My idea of a sufficiently large dynamic range would be those genes that lie above a certain threshold in a coefficient of variation plot (standard deviation / mean), i.e., genes whose variation exceeds what is expected given their mean expression across all samples.
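As a rough illustration of what I mean, the per-gene coefficient of variation could be computed as sketched below with NumPy. The toy matrix and the fixed `cv_cutoff` are assumptions for the example only; a single fixed cutoff is just a placeholder and is not the mean-dependent expectation I am actually after.

```python
import numpy as np

# Toy genes-by-cells expression matrix (made-up values; rows are genes).
expr = np.array([
    [10.0, 10.0, 10.0, 10.0],  # no variation across cells
    [0.0, 20.0, 0.0, 20.0],    # strong variation around a mean of 10
    [9.0, 11.0, 9.0, 11.0],    # weak variation around a mean of 10
])

# Per-gene mean and sample standard deviation across cells.
mean = expr.mean(axis=1)
sd = expr.std(axis=1, ddof=1)

# Coefficient of variation; genes with zero mean are left as NaN.
cv = np.divide(sd, mean, out=np.full_like(mean, np.nan), where=mean > 0)

# Placeholder cutoff (an arbitrary assumption, not mean-dependent).
cv_cutoff = 0.5
high_cv = cv > cv_cutoff
print(cv, high_cv)
```

In this toy case only the second gene is flagged, but with real data the cutoff would have to depend on the mean, which is exactly the part I am missing.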
However, I don't know how to obtain this expectation in order to define which genes deviate from it. I'm sure this is a common task in RNA-seq... any ideas?