I've been doing some RNA-Seq analysis and wanted to get some opinions from the community. I primarily work on yeast (~6000 genes), and after aligning reads (usually with bowtie2) and counting reads in features (with HTSeq or featureCounts), I often find that the vast majority of genes have at least one read mapped. This has got me thinking about how to tell whether a gene is expressed and how other people in the community quantify expression.
I'm not suggesting that every gene with a single read mapped is expressed, and I always include a cutoff to exclude genes with few mapped reads from any differential expression analysis, but it does raise some questions. What value of RPKM or TPM would people use to say a gene is expressed? Is it usual to find counts for the majority of genes when we might expect only a subset of genes to be functioning at any one time? If you found that only, say, 10% of the transcriptome had reads mapped, would you be skeptical of the data?
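To make the question concrete, here is a minimal sketch of the kind of filter I have in mind: convert raw counts to TPM, then call a gene "expressed" only if it passes both a TPM cutoff and a raw-count cutoff. The function names, the gene values, and the thresholds (`min_tpm=1.0`, `min_count=10`) are placeholders I made up for illustration, not recommended values; choosing them is exactly what I'm asking about.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM given feature lengths in base pairs."""
    # Reads per kilobase: normalise each count by feature length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0 for _ in rpk]
    # Scale so that the values sum to one million.
    return [r / total * 1e6 for r in rpk]


def is_expressed(counts, lengths_bp, min_tpm=1.0, min_count=10):
    """Flag genes passing both cutoffs (threshold defaults are arbitrary placeholders)."""
    values = tpm(counts, lengths_bp)
    return [t >= min_tpm and c >= min_count for t, c in zip(values, counts)]


# Toy data, invented for illustration only.
genes = ["geneA", "geneB", "geneC"]
counts = [500, 1, 0]
lengths_bp = [1000, 2000, 1500]

tpm_values = tpm(counts, lengths_bp)
flags = is_expressed(counts, lengths_bp)
for gene, value, flag in zip(genes, tpm_values, flags):
    print(gene, round(value, 2), flag)
```

In this toy case the gene with a single read gets a large TPM only because the library is tiny, which is why I combine the TPM cutoff with a minimum raw count.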
I'd be interested to hear people's opinions and happy to be directed to any relevant literature.