Question: how to use ESTIMATE to infer tumor purity and stromal score from RNA-seq data?
lhaiyan3 (United States) wrote, 20 months ago:

Dear all:

Has anyone used ESTIMATE (http://bioinformatics.mdanderson.org/main/ESTIMATE:Overview) to infer tumor purity and stromal score from RNA-seq data? I am not clear on how to use this tool or what input file format it expects. The documentation lists only the few steps below, and I could not figure out how to load my own data to run the program. Thanks very much for your help.

OvarianCancerExpr <- system.file("extdata", "sample_input.txt", package="estimate")
filterCommonGenes(input.f=OvarianCancerExpr, output.f="OV_10412genes.gct", id="GeneSymbol")
estimateScore("OV_10412genes.gct", "OV_estimate_score.gct", platform="affymetrix")
plotPurity(scores="OV_estimate_score.gct", samples="s516", platform="affymetrix")

best

Haiyan Lei

rna-seq • 2.5k views
ADD COMMENTlink modified 15 months ago by sina.nassiri40 • written 20 months ago by lhaiyan330
sina.nassiri (Switzerland/Lausanne) wrote, 15 months ago:

The ESTIMATE algorithm (Yoshihara et al. 2013 Nature Communications) consists of two steps. In the first step, enrichment scores are calculated using single-sample GSEA (Barbie et al. 2009 Nature). Note that although immune cells are essentially part of the stroma, Yoshihara et al. calculated two enrichment scores: one based on immune-related genes, which they referred to as the "immune" score, and one based on non-immune genes, which they referred to as the "stromal" score. The final ESTIMATE score is the sum of the immune and stromal enrichment scores. In the second step, the ESTIMATE score is converted to tumor purity using the following formula:

Tumor purity = cos(0.6049872018 + 0.0001467884 × ESTIMATE score)

where "Tumor purity" represents ABSOLUTE-based tumor purity (ABSOLUTE is another algorithm, which computes tumor purity from somatic DNA copy number alterations), and "ESTIMATE score" represents the ESTIMATE enrichment score obtained from TCGA Affymetrix data, as explained above. The key point is that this calibration formula was derived using only Affymetrix data, and therefore cannot be used to convert an RNA-seq-based ESTIMATE score to tumor purity. That said, you may still apply the single-sample GSEA algorithm to properly normalized RNA-seq data to obtain ESTIMATE enrichment scores, and include them as a covariate in your downstream analysis to account for tumor purity.
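For illustration, the conversion formula above can be written as a one-line R function. This is only a sketch: the function name `estimate_to_purity` and the example scores are made up here, and, as explained above, the result is only meaningful for Affymetrix-derived ESTIMATE scores.

```r
# Sketch of the calibration formula quoted above.
# `estimate_to_purity` is a hypothetical helper name, not part of the estimate package.
# The constants were fitted on Affymetrix data only, so do not apply this
# to ESTIMATE scores computed from RNA-seq.
estimate_to_purity <- function(estimate_score) {
  stopifnot(is.numeric(estimate_score))
  cos(0.6049872018 + 0.0001467884 * estimate_score)
}

# Made-up ESTIMATE scores, purely to show the direction of the relationship:
# a higher ESTIMATE score (more stromal/immune signal) gives a lower purity.
example_scores <- c(-1000, 0, 1000, 3000)
purity <- estimate_to_purity(example_scores)
print(round(purity, 3))
```

Within the range of scores shown, the function is monotonically decreasing, which matches the intuition that more stromal and immune signal means a less pure tumor sample.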

ADD COMMENTlink modified 15 months ago • written 15 months ago by sina.nassiri40

This does not answer the question.

ADD REPLYlink written 13 months ago by friendshipweekpoem0

"The key point is that this calibration formula was derived using only Affymetrix data, and therefore cannot be used to convert RNAseq-based ESTIMATE score to tumor purity" ... How does this not answer the question?

ADD REPLYlink written 12 months ago by sina.nassiri40

What I need is immune fraction and stromal fraction. Is it possible to derive these from ESTIMATE scores?

ADD REPLYlink written 8 months ago by CY370

I think you can definitely use ESTIMATE with RNA-seq data as this was done by the authors themselves. See the tool's website.

ADD REPLYlink written 5 weeks ago by Martombo2.4k

First, "as this was done by X" is rarely the right way to verify the assumptions of a computational algorithm. Second, ESTIMATE is published and the R code is publicly available for anyone to review. By default, the ESTIMATE R package only accepts "affymetrix", "agilent", or "illumina" microarray data as input. Can you feed normalized RNA-seq data to ESTIMATE? You surely can! ESTIMATE uses single-sample GSEA to compute immune and stromal scores; it then adds them up to get the ESTIMATE score, which you can use in downstream analyses. In fact, this is what is provided on their website for TCGA RNA-seq data. However, you can't plug these scores into their formula to calculate tumor purity, because that formula was derived specifically for Affymetrix microarray data.
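As a rough sketch of what feeding your own data could look like: the package's sample input appears to be a tab-delimited text file with gene identifiers in the first column and one column per sample, so the code below writes a matrix in that layout. The toy matrix, the helper names `write_estimate_input` and `run_estimate_scores`, and the choice of `platform = "illumina"` are all assumptions made for illustration. As far as I can tell, only `platform = "affymetrix"` adds the purity conversion, but please check the package documentation before relying on that.

```r
# Toy matrix standing in for normalized RNA-seq expression (e.g. log2 TPM).
# Gene and sample names are placeholders, not real data.
expr <- matrix(
  c(5.1, 4.8, 6.0,
    2.3, 2.9, 1.7,
    7.4, 7.1, 7.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C"), c("s1", "s2", "s3"))
)

# Hypothetical helper: write genes-by-samples data as tab-delimited text,
# with gene identifiers in the first column and sample names in the header.
write_estimate_input <- function(expr, path) {
  write.table(expr, file = path, sep = "\t", quote = FALSE, col.names = NA)
  invisible(path)
}

input_file <- write_estimate_input(expr, tempfile(fileext = ".txt"))

# Hypothetical wrapper around the two calls shown in the question.
# It is defined but not run here: it needs the estimate package and a real
# expression file whose row names are gene symbols.
run_estimate_scores <- function(input_file, out_prefix) {
  if (!requireNamespace("estimate", quietly = TRUE)) {
    stop("The 'estimate' package is not installed.")
  }
  gct_file   <- paste0(out_prefix, "_genes.gct")
  score_file <- paste0(out_prefix, "_estimate_score.gct")
  estimate::filterCommonGenes(input.f = input_file, output.f = gct_file,
                              id = "GeneSymbol")
  # Assumption: a non-Affymetrix platform value yields the stromal, immune,
  # and ESTIMATE scores without the Affymetrix-only purity conversion.
  estimate::estimateScore(gct_file, score_file, platform = "illumina")
  score_file
}
```

With real data you would call something like `run_estimate_scores("my_expression.txt", "my_cohort")` and then read the resulting scores file to use the stromal, immune, and ESTIMATE scores as covariates.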

ADD REPLYlink written 5 days ago by sina.nassiri40