Question: how to correct gene expression based on tumor purity
3.5 years ago by
United States
liu4gre200 wrote:

Hi all, I am looking at TCGA gene expression data. I am also interested in tumor purity, which can be inferred with tools such as ABSOLUTE and ESTIMATE. My question is: how do I correct gene expression levels based on these inferred purity values?

modified 7 months ago by catechize.2.learn120 • written 3.5 years ago by liu4gre200

Hi, have you found an answer to this? For differential expression analysis of RNA-seq data, it seems that the DESeq2 package lets you correct for purity estimates. See this paper:

modified 2.7 years ago • written 2.7 years ago by Alejandro Jimenez Sanchez120

Hi, did you figure out the answer to your question? I have the same question about how to apply the tumor purity value to gene expression levels. I was able to calculate the tumor purity for each TCGA case in my cancer of interest, but now I am unsure how to apply it. Please let me know if you have gained a better understanding of how to work with tumor purity.

Thank you

written 2.6 years ago by rummy.chowdhury0

I have the same question.

modified 2.1 years ago • written 2.1 years ago by Shixiang70
24 months ago, Kevin Blighe65k wrote:

You just need to include it as a covariate in your design formula. This does not modify the expression values themselves, but it does adjust the statistical inferences made from them for the purity estimate.
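As a rough illustration of what "include it as a covariate" means, here is a minimal sketch in Python of an ordinary least-squares fit for a single gene with purity as an extra column in the design matrix. This is not DESeq2 (which fits a negative binomial model, with a design along the lines of `~ purity + condition`); the function name `fit_with_purity`, the simulated data, and the effect sizes are all assumptions made up for the example.

```python
import numpy as np

def fit_with_purity(expr, group, purity):
    """Fit expr ~ 1 + group + purity for one gene by ordinary least squares.

    Returns the coefficients [intercept, group effect, purity effect].
    The group effect is the one of interest; purity is a nuisance covariate.
    """
    expr = np.asarray(expr, dtype=float)
    design = np.column_stack([
        np.ones(len(expr)),               # intercept
        np.asarray(group, dtype=float),   # 0/1 condition indicator
        np.asarray(purity, dtype=float),  # numeric purity estimate in [0, 1]
    ])
    beta, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return beta

# Simulated (hypothetical) log-expression values for one gene.
rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)
purity = rng.uniform(0.3, 1.0, size=n)
expr = 5.0 + 1.5 * group + 2.0 * purity + rng.normal(0.0, 0.1, size=n)

beta = fit_with_purity(expr, group, purity)
print("group effect adjusted for purity:", beta[1])
```

The expression values are left untouched; only the estimate of the group effect changes, because part of the variation is now attributed to purity.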



Edit: November 12, 2018:

Some evidence to back this:

"In conclusion, we have shown that the influence of tumour purity on the results of genomic analyses is much stronger than previously appreciated, and ought to be included as a covariate in any future analysis."


This is disputed here, though, where it is argued that the effect of purity should be multiplicative:

"There are some practices to account for purity in differential expression (DE) analysis [46] by adding purities as a covariate in the linear model. As we will show, the purity should have a multiplicative effect instead of an additive effect."
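One way to read "multiplicative" is as a mixture model: a tumor sample with purity p is observed as (1 - p) × normal + p × pure tumor, so the tumor-versus-normal difference enters the model scaled by p rather than as a separate additive purity term. The sketch below fits that reading for one gene by least squares. It is an assumption about what the quoted paper means, not its actual implementation; `fit_purity_scaled` and the simulated values are hypothetical.

```python
import numpy as np

def fit_purity_scaled(expr, is_tumor, purity):
    """Fit expr = mu_normal + (is_tumor * purity) * delta by least squares.

    delta estimates (pure tumor mean - normal mean). Normal samples
    contribute a zero in the scaled column regardless of their purity value.
    """
    expr = np.asarray(expr, dtype=float)
    scaled = np.asarray(is_tumor, dtype=float) * np.asarray(purity, dtype=float)
    design = np.column_stack([np.ones(len(expr)), scaled])
    beta, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return beta  # [mu_normal, delta]

# Simulated (hypothetical) data: 100 normal and 100 tumor samples.
rng = np.random.default_rng(1)
mu_normal, mu_tumor = 4.0, 7.0
is_tumor = np.repeat([0, 1], 100)
purity = rng.uniform(0.3, 0.9, size=200)
mix = np.where(is_tumor == 1, purity, 0.0)  # fraction of tumor cells per sample
expr = (1.0 - mix) * mu_normal + mix * mu_tumor + rng.normal(0.0, 0.1, size=200)

mu_n_hat, delta_hat = fit_purity_scaled(expr, is_tumor, purity)
print("estimated pure-tumor vs normal difference:", delta_hat)
```

A plain tumor-versus-normal comparison on the same data would underestimate the difference, since each tumor sample only shows a fraction p of it.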


modified 22 months ago • written 24 months ago by Kevin Blighe65k

But if the purity value is numeric, how do I add it to the design formula? Should I classify samples as low, medium, and high?

Is there a method to adjust the gene expression by tumor purity?

written 23 months ago by Chun-Jie Liu270

A covariate can be categorical or numeric. I am not aware of a program that directly adjusts for tumor purity (but one likely exists... somewhere).
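To make the numeric-versus-categorical distinction concrete, the following sketch builds the purity part of a design matrix either way: as a single numeric column, or as dummy columns for low/medium/high bins. The helper `purity_columns` and the cutoffs 0.4 and 0.7 are hypothetical choices for illustration only. Binning is not required, and it discards within-bin differences that the numeric column keeps.

```python
import numpy as np

def purity_columns(purity, cutoffs=None):
    """Return the design-matrix column(s) encoding purity.

    cutoffs=None   -> one numeric column (purity used as-is).
    cutoffs=[a, b] -> dummy columns for the bins above the lowest one
                      (the lowest bin is the reference level).
    """
    purity = np.asarray(purity, dtype=float)
    if cutoffs is None:
        return purity[:, None]
    levels = np.digitize(purity, cutoffs)      # 0 = low, 1 = medium, 2 = high
    onehot = np.eye(len(cutoffs) + 1)[levels]  # one column per bin
    return onehot[:, 1:]                       # drop the reference bin

purity = [0.2, 0.5, 0.9]
numeric = purity_columns(purity)
binned = purity_columns(purity, cutoffs=[0.4, 0.7])
print(numeric.ravel())
print(binned)
```

Either result can be stacked next to an intercept and a condition column to form the full design matrix.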

written 23 months ago by Kevin Blighe65k
7 months ago by
catechize.2.learn120 wrote:


You may look at this article:

contamDE-lm: linear model-based differential gene expression analysis using next-generation RNA-seq data from contaminated tumor samples

written 7 months ago by catechize.2.learn120