Hi all, I am looking at TCGA gene expression data. I am also interested in tumor purity, which can be inferred by tools such as ABSOLUTE and ESTIMATE. My question is: how do I correct gene expression levels based on these inferred values?
You just need to include the purity estimate as a covariate in your design formula. This does not directly modify your data to adjust for purity, but it does adjust the statistical inferences made from that data.
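To make the idea concrete, here is a minimal sketch in Python with NumPy, using simulated data for a single gene. It is not the code of any particular differential expression package; the function name `fit_with_purity`, the effect sizes, and the simulated purity values are all assumptions made for illustration. It fits `expr ~ 1 + group + purity` by ordinary least squares and compares the group effect with a naive fit that leaves purity out.

```python
import numpy as np

def fit_with_purity(expr, group, purity):
    """OLS fit of expr ~ 1 + group + purity for one gene (illustrative helper)."""
    # Design matrix: intercept, group indicator, purity covariate.
    X = np.column_stack([np.ones_like(purity), group, purity])
    coef, *_ = np.linalg.lstsq(X, expr, rcond=None)
    return dict(zip(["intercept", "group", "purity"], coef))

# Simulated data (all numbers are assumptions, not TCGA values).
rng = np.random.default_rng(0)
n = 200
group = np.repeat([0.0, 1.0], n // 2)
# Group 1 is simulated with lower purity on average, so purity confounds the comparison.
purity = np.clip(rng.normal(0.7 - 0.2 * group, 0.1), 0.05, 1.0)
# True group effect is 1.0; expression also rises with purity.
expr = 5.0 + 1.0 * group + 3.0 * purity + rng.normal(0, 0.2, n)

adjusted = fit_with_purity(expr, group, purity)

# Naive fit without the purity covariate, for comparison.
X_naive = np.column_stack([np.ones(n), group])
naive_group = np.linalg.lstsq(X_naive, expr, rcond=None)[0][1]

print("group effect, purity-adjusted:", round(float(adjusted["group"]), 2))
print("group effect, naive:", round(float(naive_group), 2))
```

In this simulation the naive estimate is pulled away from the true group effect because purity differs between the groups, while the adjusted fit recovers it. The expression values themselves are never changed; only the inference is.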
Edit: November 12, 2018:
Some evidence to back this:
"In conclusion, we have shown that the influence of tumour purity on the results of genomic analyses is much stronger than previously appreciated, and ought to be included as a covariate in any future analysis."
This is disputed here, though, where it is stated that the purity estimate should enter the model multiplicatively:
"There are some practices to account for purity in differential expression (DE) analysis by adding purities as a covariate in the linear model. As we will show, the purity should have a multiplicative effect instead of an additive effect."
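One way to read "multiplicative" is through a simple mixture model, which is my assumption here and not necessarily the exact formulation used in the quoted paper: a tumor sample with purity p is a blend of tumor and normal cells, so its expected expression is p * mu_tumor + (1 - p) * mu_normal = mu_normal + p * (mu_tumor - mu_normal). Purity then scales the tumor-vs-normal difference instead of adding a separate shift. A minimal Python sketch on simulated data (the function name `fit_mixture_model` and all numbers are hypothetical):

```python
import numpy as np

def fit_mixture_model(expr, is_tumor, purity):
    """OLS fit of expr ~ 1 + is_tumor * purity for one gene (illustrative helper)."""
    # Intercept estimates the normal-cell mean; the second column carries purity
    # for tumor samples and 0 for normal samples, so its coefficient estimates
    # the pure-tumor minus normal difference.
    X = np.column_stack([np.ones(len(expr)), is_tumor * purity])
    coef, *_ = np.linalg.lstsq(X, expr, rcond=None)
    return {"normal_mean": coef[0], "tumor_minus_normal": coef[1]}

# Simulated data under the assumed mixture model.
rng = np.random.default_rng(1)
n = 200
is_tumor = np.repeat([0.0, 1.0], n // 2)
purity = rng.uniform(0.3, 0.9, n)  # only used for tumor samples
mu_normal, mu_tumor = 5.0, 7.0
expr = mu_normal + is_tumor * purity * (mu_tumor - mu_normal) + rng.normal(0, 0.2, n)

mixture_fit = fit_mixture_model(expr, is_tumor, purity)
print({k: round(float(v), 2) for k, v in mixture_fit.items()})
```

Under this assumption the coefficient on `is_tumor * purity` estimates the difference between pure tumor and normal expression, which is the quantity a purity-aware DE test would target.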