GSEA vs GO enrichment
9 weeks ago
HarperReed ▴ 10

Hello, I would be very grateful if someone could explain the differences between Gene Ontology (GO), Gene Set Enrichment Analysis (GSEA), and pathway enrichment (using KEGG). I have read several publications but still find it difficult to fully understand how they differ.

Additionally, could you please recommend some tools or general-purpose packages that can perform GO and GSEA analyses across different organisms, including microorganisms, humans, and plants?

go kegg gsea • 14k views
9 weeks ago
ATpoint 89k

The question at hand is really "what is GSEA versus overrepresentation analysis". GO, KEGG, and others such as REACTOME, PathwayCommons and WikiPathways are just databases that store annotations, that is, groups of genes under a certain name. Granularity and redundancy differ between databases, but the entries are all "terms", "pathways", or whatever you want to call them.

Now for the question:

Overrepresentation analysis (ORA) (for example, what many websites like the GO one offer you)

You provide a list of genes, and the statistics check whether there are pathways that intersect with the provided genes more than expected by chance. The test is usually a hypergeometric one (or a flavour of it). Some tools take a background, such as all tested genes, to correct for pre-enrichment of transcriptomes (for example, an immune cell is pre-enriched for immune genes, a hepatocyte is pre-enriched for metabolic genes...) and incorporate that into the p-value.
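To make the ORA idea concrete, here is a minimal sketch of a one-sided hypergeometric test with an explicit background, using only the Python standard library. The function names (`hypergeom_sf`, `ora`) and the toy gene names are made up for illustration, and real tools add multiple-testing correction and other refinements on top of this.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def ora(gene_list, pathways, background):
    """Overrepresentation p-value per pathway, restricted to the background."""
    background = set(background)
    genes = set(gene_list) & background
    pvalues = {}
    for name, members in pathways.items():
        members = set(members) & background  # ignore genes that were never tested
        overlap = len(genes & members)
        pvalues[name] = hypergeom_sf(overlap, len(background), len(members), len(genes))
    return pvalues


# Toy data (hypothetical): 20 tested genes, two pathways of 5 genes each.
background = [f"g{i}" for i in range(20)]
pathways = {
    "A": [f"g{i}" for i in range(0, 5)],
    "B": [f"g{i}" for i in range(10, 15)],
}
de_genes = ["g0", "g1", "g2", "g3", "g10"]  # 4 hits in A, 1 hit in B

ora_pvalues = ora(de_genes, pathways, background)
for name, p in ora_pvalues.items():
    print(name, round(p, 4))
```

Pathway A shares 4 of its 5 genes with the list and gets a small p-value, while pathway B shares only one and does not. Shrinking or growing `background` changes the result, which is the pre-enrichment correction mentioned above.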

GSEA

Gene set enrichment analysis usually takes all tested genes of an analysis (in a pre-ranked fashion), or a count matrix, and then asks whether the DE statistics of the pathway genes are more extreme than the general trend across all genes. In the fgsea package from Bioconductor this is done by ranking genes by a user-provided metric and then testing the rank distribution using a gene permutation strategy to get p-values. The limma package implements such competitive gene set tests by essentially (and I hope I get the concept right) performing a t-test between the DE statistics (I think it's the t-stats, but could be wrong) of the pathway genes and the t-stats of all genes, with some additional magic under the hood to account for gene correlation. In any case, it does not ask "are these DE genes from my analysis enriched for something"; instead it interrogates shifts in DE trends for enrichment.
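As a rough illustration of the pre-ranked idea, the sketch below computes an unweighted running-sum enrichment score and a gene-permutation p-value in plain Python. This is a simplification and not what fgsea or limma actually run: the score is unweighted, the p-value is two-sided, and the names `enrichment_score` and `gene_permutation_pvalue` as well as the toy scores are assumptions made for the example.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running sum: step up on pathway genes, down on all others."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


def gene_permutation_pvalue(scores, gene_set, n_perm=2000, seed=0):
    """Compare the observed score against random gene sets of the same size."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    members = set(gene_set) & set(ranked)
    observed = enrichment_score(ranked, members)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked, len(members)))
        if abs(enrichment_score(ranked, null_set)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Toy ranking (hypothetical): 100 genes, g0 has the highest DE statistic.
scores = {f"g{i}": 50.0 - i for i in range(100)}
top_set = [f"g{i}" for i in range(10)]  # pathway sitting at the top of the ranking

es, pval = gene_permutation_pvalue(scores, top_set)
print(round(es, 3), round(pval, 4))
```

Note that no DE cutoff appears anywhere: every tested gene contributes through its rank, which is the key difference from ORA.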


Thank you very much for this detailed answer! Now it's clear in my mind! Do you have any recommendation for a general-purpose package that can perform GSEA analyses across different organisms (microorganisms, humans, plants, etc.)?


I use camera from limma, or sometimes fgsea. Both are in Bioconductor. It has nothing to do with species, it's just the stats framework. The species depends solely on the database you take your annotations from. I use REACTOME; no idea what to use for bugs and plants.
