Question

Interpretation of normalized counts

4.4 years ago

ika ▴ 50

I've been asked to determine which genes in my RNA-Seq experiment are expressed across my experimental groups. This does not refer to differential expression, but to genes that are uniquely expressed or not expressed in the different groups. I expected to get this done quite quickly, but I'm unsure about the interpretation of the values.

I assume that, even to check for expression in general, it still makes sense to apply my filtering and normalization (TMM and quantile) beforehand. I'm unsure how to differentiate between expressed and not expressed based on the resulting values, though. They range from -3.54 to 14.8, and I don't know whether I should treat any value below 0, or only values equal to the minimum of -3.54, as not expressed.

Would you agree with my approach and how would you differentiate between expressed and not expressed?

RNA-Seq limma voom normalization • 801 views

updated 4.4 years ago by Devon Ryan 104k • written 4.4 years ago by ika ▴ 50

score 2 · Answer 1 · 2019-12-04

I'm not sure why you're using both TMM and quantile normalization; just pick one.

As to your actual question, any expression cutoff you choose will be almost entirely arbitrary. A somewhat common method is to use zFPKMs (there's a Bioconductor package for this): https://www.ncbi.nlm.nih.gov/pubmed/24215113. In practice you're unlikely to see cleanly separated "expressed" and "not expressed" distributions, but at least you'll have a plot to pick a (still arbitrary) value from.
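
To make the zFPKM idea concrete, here is a rough Python sketch rather than the Bioconductor package itself. It assumes you have per-gene FPKM values for one sample (voom log-CPM values are not length-normalized, so they are not a drop-in input). It takes the main peak of log2(FPKM) as the mean, estimates the standard deviation from the right half of the distribution only, and z-scores every gene against that fit. The histogram-based peak finder, the function names, and the simulated data are all simplifications made up for illustration; a cutoff around -3 is commonly quoted for zFPKM, but it remains a judgment call.

```python
import math
import random
from collections import Counter


def zfpkm(fpkm, bin_width=0.5):
    """Simplified zFPKM-style transform of one sample's FPKM values.

    Assumption: the main peak of log2(FPKM) is located with a coarse
    histogram rather than a kernel density estimate.
    """
    logs = [math.log2(v) for v in fpkm if v > 0]

    # Locate the most populated histogram bin and use its centre as mu.
    bins = Counter(math.floor(x / bin_width) for x in logs)
    mode_bin = max(bins, key=bins.get)
    mu = (mode_bin + 0.5) * bin_width

    # Estimate sigma from the right half only: for a half-normal,
    # E[X - mu | X > mu] = sigma * sqrt(2 / pi).
    right = [x for x in logs if x > mu]
    sigma = (sum(right) / len(right) - mu) * math.sqrt(math.pi / 2)

    # Zero FPKM has no finite log2; report it as -inf (never "expressed").
    return [(math.log2(v) - mu) / sigma if v > 0 else float("-inf") for v in fpkm]


def simulate_fpkm(n_active=20000, n_noise=4000, seed=0):
    """Synthetic FPKMs (made up for illustration, not real data)."""
    rng = random.Random(seed)
    active = [2 ** rng.gauss(3.0, 2.0) for _ in range(n_active)]
    noise = [2 ** rng.gauss(-5.0, 1.5) for _ in range(n_noise)]
    return active + noise


if __name__ == "__main__":
    values = simulate_fpkm()
    z = zfpkm(values)
    cutoff = -3.0  # adjustable; still an arbitrary choice
    n_expressed = sum(score >= cutoff for score in z)
    print(f"{n_expressed} of {len(values)} genes pass zFPKM >= {cutoff}")
```

Fitting only the right half is the point of the method: the low-expression shoulder on the left would otherwise inflate the estimated spread. Plotting the z-scores per sample is still worthwhile before committing to any cutoff.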

As an aside, "expressed" on its own is not biologically interesting; the more useful question is which genes are meaningfully expressed. Of course, "meaningfully" has only an arbitrary definition at this point.