Question: RNA-Seq counts for all genes?
ra381 (10) wrote, 3.1 years ago:

I've been doing some RNA-Seq analysis and wanted to get some opinions from the community. I primarily work on yeast (~6000 genes), and when I do read alignment (usually with Bowtie2) and count reads in features (with HTSeq or featureCounts), I often find that the vast majority of genes have at least one read mapped. This has got me thinking about how to tell whether a gene is expressed and how other people in the community quantify expression.

I'm not suggesting that every gene with a single read mapped is expressed, and I always apply a cutoff to exclude genes with few mapped reads from any differential expression analysis, but it does raise some questions. What value of RPKM or TPM would people use to say a gene is expressed? Is it usual to find counts for the majority of genes when we might expect only a subset of genes to be functioning at any one time? If you found that only, say, 10% of the transcriptome had reads mapped, would you be skeptical of the data?
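To make the two quantities in this question concrete, here is a minimal sketch of a raw-count cutoff and a TPM calculation. The gene names, counts, lengths, and the `min_count=10` threshold are all invented for illustration; none of them come from real data or from a recommended standard.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given feature lengths in base pairs."""
    # Step 1: reads per kilobase, so longer features are not favoured.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    # Step 2: rescale so that the values of all genes sum to one million.
    scale = sum(rpk.values()) / 1e6
    return {g: v / scale for g, v in rpk.items()}


def passes_cutoff(counts, min_count=10):
    """Genes with at least `min_count` reads (hypothetical threshold)."""
    return [g for g, c in counts.items() if c >= min_count]


# Toy data, invented for illustration only.
counts = {"geneA": 500, "geneB": 1, "geneC": 0, "geneD": 40}
lengths_bp = {"geneA": 2000, "geneB": 1000, "geneC": 1500, "geneD": 800}

tpm_values = tpm(counts, lengths_bp)
detected = [g for g, c in counts.items() if c > 0]  # at least one read
kept = passes_cutoff(counts, min_count=10)          # survive the cutoff
```

In this toy case three of four genes are "detected", but only two pass the cutoff, which is the gap between "has a read" and "has enough reads to analyse" that the question is about.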

I'd be interested to hear people's opinions and happy to be directed to any relevant literature.

rna-seq • 1.4k views

What if we drop the dichotomy of expressed / not expressed once and for all?
It seems pretty clear to me that genes can at times show very marginal expression. In my opinion, the number of counts only makes sense in comparison with something else (i.e. a second condition).
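As a rough illustration of comparing counts between two conditions rather than thresholding them, the sketch below computes a log2 fold change on depth-normalised counts. The function names and the pseudocount of 0.5 are assumptions made for this example; dedicated differential expression tools use more careful statistical models.

```python
import math


def cpm(count, library_size):
    """Counts per million: a raw count scaled by sequencing depth."""
    return count / library_size * 1e6


def log2_fold_change(count_a, lib_a, count_b, lib_b, pseudo=0.5):
    """log2 ratio of condition B over condition A on the CPM scale.

    `pseudo` is a hypothetical pseudocount that avoids division by zero
    and damps ratios built from very low counts.
    """
    return math.log2((cpm(count_b, lib_b) + pseudo) / (cpm(count_a, lib_a) + pseudo))


# Same raw count ratio, very different library sizes: no real change.
no_change = log2_fold_change(3, 1_000_000, 6, 2_000_000)
# Same depth, four times the reads in condition B.
increase = log2_fold_change(5, 1_000_000, 20, 1_000_000, pseudo=0.0)
```

The first call shows why a raw count on its own says little: doubling the reads at double the depth is no change at all.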

check this question as well

modified 3.1 years ago • written 3.1 years ago by Martombo (2.4k)

Thanks. For the most part I'd be happy not to think about it as expressed/not expressed. As you say, differential expression or building co-expression networks relies on comparisons or correlations of counts or expression estimates between samples and genes.

written 3.1 years ago by ra381 (10)

Hear, hear!

Transcription is a biochemical reaction, which is entirely dependent upon the local concentrations of reagents, catalysts, and inhibitors. While some biochemical pathways exhibit cooperativity/ultrasensitivity, they are still probabilistic rather than deterministic (all-or-none). When viewed in this light, marginal/spurious expression is to be expected.

written 3.1 years ago by harold.smith.tarheel (4.3k)

After re-reading my old comment (somebody just left an upvote), I feel that one of the main points was not mentioned in this discussion: when analysing bulk RNA-seq we often ignore cell-to-cell differences, which can be a major source of variation. In a population of millions of cells there might be a few that express a certain, otherwise silent, gene at reasonable levels. That can make its overall level quite low, but a high fold change in this gene could still be meaningful, representing a relative increase in the abundance of the cell subtype that expresses it.
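The arithmetic behind this point can be sketched as a simple mixture. The numbers are invented, and the model (bulk signal as a plain average over two cell subtypes) is a simplifying assumption made for illustration only.

```python
def bulk_level(fraction_expressing, level_in_expressing, level_in_others=0.0):
    """Average per-cell expression in a mixture of two cell subtypes."""
    return (fraction_expressing * level_in_expressing
            + (1.0 - fraction_expressing) * level_in_others)


# Hypothetical gene: silent in most cells, 100 units in a rare subtype.
before = bulk_level(0.001, 100.0)  # subtype is 0.1% of the population
after = bulk_level(0.01, 100.0)    # subtype grows to 1% of the population

# Per-cell expression in the subtype never changed, yet the bulk signal did.
fold_change = after / before
```

Here the bulk level stays low in both conditions, but it rises about 10-fold purely because the expressing subtype became more abundant.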

written 8 months ago by Martombo (2.4k)