I'm trying to understand the step-by-step procedure, rationale and logic behind the deconvolution of gene expression from heterogeneous samples.
I understand how, using reference expression profiles and/or cell-specific gene markers, cell types can be identified and their relative abundances can be estimated.
However, as shown in Fig 3C-D here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3874291/), an alternative approach to deconvolution uses only (i) global gene expression data and (ii) cell proportions, i.e. no reference profiles or markers. This approach is discussed further here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699332/.
How is this achieved? How can you obtain cell-specific expression patterns using proportions alone, and how do you know which cell type is which? Does it depend on cell-type frequencies varying between samples, so that changes in frequency can be correlated with changes in the expression of each gene?
As an example, suppose we had three heterogeneous populations containing cell types A and B in the proportions (a) A 25%, B 75%; (b) A 50%, B 50%; (c) A 75%, B 25%, and three genes X, Y and Z. Can anybody talk me through what we'd do to obtain cell-specific expression patterns?
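
To make the example concrete, here is a minimal sketch of my tentative understanding. It assumes a linear mixing model: the measured expression of each gene in a sample is the proportion-weighted sum of its expression in each cell type. The expression values in `S_true` are made up purely for illustration, the function name `deconvolve` is hypothetical, and ordinary least squares is my own choice of solver, not necessarily what the linked papers use.

```python
import numpy as np

def deconvolve(proportions, mixed_expression):
    """Estimate cell-type-specific expression by ordinary least squares.

    proportions:      samples x cell types (each row sums to 1)
    mixed_expression: samples x genes
    returns:          cell types x genes
    """
    cell_specific, *_ = np.linalg.lstsq(proportions, mixed_expression, rcond=None)
    return cell_specific

# Known proportions of cell types A and B in samples (a), (b), (c).
P = np.array([
    [0.25, 0.75],
    [0.50, 0.50],
    [0.75, 0.25],
])

# Hypothetical "true" expression of genes X, Y, Z in pure A (row 0) and pure B (row 1).
# These numbers are invented for the sketch; in practice they are the unknowns.
S_true = np.array([
    [10.0, 2.0, 5.0],
    [1.0, 8.0, 5.0],
])

# Simulated measurements from the mixed samples (noise-free): M = P @ S.
M = P @ S_true

# Recover cell-specific expression from the mixtures and proportions alone.
S_est = deconvolve(P, M)
print(S_est)
```

Under this model, the rows of the estimate follow the column order of `P`, so the cell-type labels would come from the proportion measurements themselves. If every sample had the same proportions, `P` would have rank 1 and the two cell types could not be separated, which seems to be why variability in cell-type frequencies matters.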