Question: ssGSEA on RNA-Seq data from TCGA
3.5 years ago by
United States
oriolebaltimore110 wrote:


Given TCGA RNA-Seq data (level 3), how can one run single-sample GSEA (ssGSEA)?

For example, TCGA RNA-Seq data has

Gene     Raw counts     RPKM
ALK      434            2.3
...      ...            ...


Can we create a GCT file from the raw counts or the RPKM values and run ssGSEA on it in GenePattern?



rna-seq gene • 4.9k views
ADD COMMENTlink modified 17 months ago by fuyingxue10 • written 3.5 years ago by oriolebaltimore110


ssGSEA, in contrast to GSEA, calculates a separate enrichment score for each pairing of a sample and a gene set. However, both need a .GCT file as input. From TCGA you need to download the level 3 data, and it has to be expression data. After some preprocessing, your .GCT file should look like this:


#1.2
no. of rows    no. of columns
NAME     Description    Sample1       Sample2    ...
gene1    NA             exp. value    ...
gene2    NA             ...
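The layout above can be produced with a few lines of Python. This is only a minimal sketch: the function name write_gct and the example values are hypothetical (only ALK with an RPKM of 2.3 comes from the question), and it assumes the GCT 1.2 convention of a "#1.2" version line followed by tab-separated rows.

```python
import io


def write_gct(expression, sample_names, handle):
    """Write a gene-by-sample table in GCT 1.2 layout.

    expression: dict mapping gene name -> list of values, one per sample.
    handle: any writable text file object.
    """
    handle.write("#1.2\n")
    # Second line: number of genes (rows), then number of samples (columns).
    handle.write(f"{len(expression)}\t{len(sample_names)}\n")
    handle.write("\t".join(["NAME", "Description", *sample_names]) + "\n")
    for gene, values in expression.items():
        if len(values) != len(sample_names):
            raise ValueError(f"{gene}: expected {len(sample_names)} values")
        # The Description column is filled with NA, as in the layout above.
        row = [gene, "NA", *(str(v) for v in values)]
        handle.write("\t".join(row) + "\n")


# Hypothetical RPKM values for two samples.
rpkm = {"ALK": [2.3, 0.0], "GENE2": [10.5, 8.1]}
buffer = io.StringIO()
write_gct(rpkm, ["Sample1", "Sample2"], buffer)
gct_text = buffer.getvalue()
print(gct_text)
```

To write to disk instead, pass a file opened with open("expression.gct", "w") as the handle.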


I don't think those statistics can be calculated directly from raw counts or RPKM. Raw counts may be biased, so I think you should consider a normalisation step to make the analysis comparable with other studies.

Best regards!

ADD REPLYlink written 3.5 years ago by orzech_mag200
20 months ago by
MMa270 wrote:

I know this is an old thread, but ssGSEA can be calculated using the Bioconductor package GSVA. If you use RPKM, use ssgsea <- gsva(RPKM, method="ssgsea", kcdf="Gaussian", ...); if you use raw counts, use ssgsea <- gsva(counts, method="ssgsea", kcdf="Poisson", ...)

ADD COMMENTlink written 20 months ago by MMa270

I think the kcdf parameter is only applicable when method = "gsva". According to the docs:

Character string denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples when method="gsva".

ADD REPLYlink modified 13 months ago • written 13 months ago by igor8.0k
17 months ago by
fuyingxue10 wrote:

Could anyone explain how ssGSEA processes the gene expression data to rank the genes for each patient sample? Does it need to be compared with a normal sample?
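To make the ranking question concrete: as far as I understand the method, ssGSEA orders the genes within each sample by their expression, so the score itself does not need a normal sample. Below is a minimal Python sketch of that idea. It assumes rank-based weights with an exponent of 0.25, ignores ties and skips any normalisation of the scores across samples. The function name ssgsea_score and the toy gene names are hypothetical, and this is not the GSVA or GenePattern implementation.

```python
def ssgsea_score(sample_expr, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene set.

    sample_expr: dict mapping gene name -> expression value in ONE sample.
    gene_set: set of gene names.
    alpha: exponent applied to the rank weights (assumed default).
    """
    # Order genes from highest to lowest expression within this sample.
    genes = sorted(sample_expr, key=sample_expr.get, reverse=True)
    n = len(genes)
    # The highest expressed gene gets rank n, the lowest gets rank 1.
    ranks = {gene: n - i for i, gene in enumerate(genes)}
    in_set = [gene in gene_set for gene in genes]
    n_hits = sum(in_set)
    if n_hits == 0 or n_hits == n:
        raise ValueError("gene set must cover some, but not all, genes")

    hit_total = sum(ranks[g] ** alpha for g, hit in zip(genes, in_set) if hit)
    miss_step = 1.0 / (n - n_hits)

    p_in = p_out = score = 0.0
    for gene, hit in zip(genes, in_set):
        if hit:
            p_in += ranks[gene] ** alpha / hit_total  # weighted step for set genes
        else:
            p_out += miss_step  # uniform step for genes outside the set
        # Sum the difference between the two running distributions.
        score += p_in - p_out
    return score


# Toy sample: six hypothetical genes with decreasing expression.
sample = {"A": 6.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0, "F": 1.0}
high = ssgsea_score(sample, {"A", "B"})  # set sits at the top of the ranking
low = ssgsea_score(sample, {"E", "F"})   # set sits at the bottom of the ranking
print(high, low)
```

Because only the within-sample ordering enters the calculation, multiplying every value in the sample by a constant leaves the score unchanged in this sketch.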

ADD COMMENTlink written 17 months ago by fuyingxue10