Does GSVA normalize across the sample set?
4 months ago
juara ▴ 40

Hello,

I have a log2-normalized TPM dataset on which I would like to run pathway enrichment analysis. I thought GSVA and ssGSEA would output enrichment scores that are independent of the sample set. However, when I subset my dataset and rerun GSVA, I get a totally different set of scores. Do you have any idea why that might be? Is there an internal normalization step within the gsva() function?

Thanks everyone

gsva ssgsea rna-seq • 387 views
4 months ago
Elucidata ▴ 240

GSVA computes its scores relative to the whole dataset it is given, not for each sample in isolation.

• If by subsetting you mean reducing the genes in the dataset, then it will have an effect on the scores: GSVA ranks the genes within each sample and computes the score from where the gene-set members fall in that ranking, so removing genes changes the ranks.
• If by subsetting you mean reducing the samples in the dataset, then it will also have an effect on the scores: GSVA fits a density function to each gene using all samples and then uses the cumulative density values from that distribution to find the GSVA score for each sample, so the fit for every gene changes when samples are removed.

In short, there is indeed an internal normalization happening within the GSVA function which relies on the complete dataset.
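
To make the sample dependence concrete, here is a minimal Python sketch of the per-gene step only. It is not the GSVA implementation: the expression values are made up, the function name kernel_cdf is hypothetical, and the bandwidth (a quarter of the gene's standard deviation) is an assumption of this sketch. It estimates a Gaussian-kernel CDF for one gene from whichever samples are supplied and shows that the same sample gets a different value once other samples are dropped.

```python
import math
from statistics import stdev


def kernel_cdf(values):
    """Gaussian-kernel estimate of one gene's CDF, evaluated at each sample.

    The estimate is built from all values passed in, so it depends on
    which samples are present.
    """
    n = len(values)
    h = stdev(values) / 4.0  # bandwidth: an assumption of this sketch

    def phi(z):
        # Standard normal CDF.
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return [sum(phi((x - y) / h) for y in values) / n for x in values]


# Hypothetical log2 TPM values for a single gene in five samples.
gene = {"s1": 2.0, "s2": 3.5, "s3": 5.0, "s4": 6.5, "s5": 8.0}

# CDF values using all five samples.
full = dict(zip(gene, kernel_cdf(list(gene.values()))))

# CDF values after keeping only the first three samples.
subset_names = ["s1", "s2", "s3"]
subset = dict(zip(subset_names, kernel_cdf([gene[s] for s in subset_names])))

for s in subset_names:
    print(s, round(full[s], 3), round(subset[s], 3))
```

Sample s3 sits in the middle of the distribution when all five samples are used, but at the top once s4 and s5 are removed, so the value that feeds into its score shifts even though its expression is unchanged.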

4 months ago
juara ▴ 40

Hi again,

So I found out that in the case of method = "ssgsea", there is a normalization step across samples and gene sets. It can be turned off with ssgsea.norm = FALSE to get scores independent of samples and gene sets. However, I still have the same normalization issue with method = "gsva".
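
For anyone wondering why that final ssGSEA step ties the scores to the rest of the dataset, here is a small Python sketch. As far as I understand the package documentation, the normalization divides every score by the difference between the largest and smallest score over all samples and gene sets; treat that as my reading rather than a statement of the exact implementation. The raw scores and the function name normalize_by_range are made up for illustration.

```python
def normalize_by_range(scores):
    """Divide every score by (max - min) taken over the whole matrix."""
    flat = [v for row in scores.values() for v in row.values()]
    span = max(flat) - min(flat)
    return {gs: {s: v / span for s, v in row.items()} for gs, row in scores.items()}


# Hypothetical raw enrichment scores: gene sets by samples.
raw = {
    "setA": {"s1": 1000.0, "s2": 3000.0, "s3": 5000.0},
    "setB": {"s1": -1000.0, "s2": 500.0, "s3": 2000.0},
}

# Normalize with all three samples present.
full = normalize_by_range(raw)

# Drop sample s3 and normalize again.
sub_raw = {gs: {s: v for s, v in row.items() if s != "s3"} for gs, row in raw.items()}
sub = normalize_by_range(sub_raw)

print(full["setA"]["s1"], sub["setA"]["s1"])
```

The raw score of setA in s1 is the same in both runs, but the divisor changes when s3 is removed, so the normalized score changes too. Skipping the division, which is what ssgsea.norm = FALSE does, removes that particular dependence.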


Yes, I believe some methods apply a further normalisation or transformation step. This should, I believe, be described in the manual pages.

Thank you for posting an answer.