Question

Why is log-intensity instead of log-ratio used to develop gene expression classifiers?

10.3 years ago

Diwan ▴ 650

Hello,

I am trying to generate a gene signature using data from an Affymetrix 3' chip. Most of the reported gene-expression-based classifiers use log-intensities to generate a gene signature. Is there any reason for that? Can we also use log-ratios (such as cancer1 - normal, cancer2 - normal, etc.) instead of intensities? My understanding is that gene expression intensity should be analyzed in relative terms rather than as an absolute value. Is that correct?

Your reply will be really helpful.

Thanks
Diwan

microarray-classifier log-intensity log-ratio • 5.8k views

updated 2.9 years ago by Ram 44k • written 10.3 years ago by Diwan ▴ 650

Ram · Answer 1 · 2014-07-07

The rationale is that you need to be able to apply the classifier to a single sample. This typically requires working with an intensity value (most likely from the cancer sample, in the example that you provided).
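To make the single-sample point concrete, here is a minimal sketch in plain Python. The gene names, intensity values, signature weights, and the simple weighted-sum score are all made up for illustration and are not taken from any published classifier. It only shows that a log-intensity score can be computed from the tumor array alone, while a log-ratio score cannot be computed without a matched normal array.

```python
import math

def log2_intensities(raw):
    """Log2-transform the raw intensities of one sample."""
    return {gene: math.log2(value) for gene, value in raw.items()}

def log2_ratios(tumor_raw, normal_raw):
    """Log2(tumor / normal) per gene; requires a second (normal) array."""
    return {gene: math.log2(tumor_raw[gene] / normal_raw[gene])
            for gene in tumor_raw}

def signature_score(features, weights):
    """Weighted sum over the signature genes (a simple linear score)."""
    return sum(weights[gene] * features[gene] for gene in weights)

# Hypothetical intensities and weights, for illustration only.
tumor = {"GENE_A": 1024.0, "GENE_B": 256.0}
normal = {"GENE_A": 256.0, "GENE_B": 512.0}
weights = {"GENE_A": 1.0, "GENE_B": -1.0}

# Log-intensity score: only the tumor sample is needed.
intensity_score = signature_score(log2_intensities(tumor), weights)

# Log-ratio score: cannot be computed without the matched normal sample.
ratio_score = signature_score(log2_ratios(tumor, normal), weights)

print(intensity_score, ratio_score)
```

With a log-ratio signature, every new patient needs a second hybridization before the classifier can be applied, and the score then also depends on how well that normal sample was chosen.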

If the data type warrants an analysis of a log2 ratio (such as aCGH data for copy number calls), then that is OK. However, there are a couple of additional considerations with gene expression data:

Does the normal sample provide useful additional information beyond what you can predict with the tumor sample? In practice, you don't want to run measurements twice if it can be avoided.
What is the biological significance of the normal sample? Can you truly call it an example of unaffected tissue that is equivalent to the tumor tissue? This is not such a big deal with DNA analysis, but it is important for RNA analysis. For example, I've been knocked for assuming that tumor-adjacent tissue is equivalent to unaffected normal tissue from another patient (which shouldn't be used in your classifier). Differences could be due to the proportion of epithelial cells rather than a pathogenic aberration: http://breast-cancer-research.com/content/12/5/R87