Question

GSEA related to other functional analyses

6.7 years ago

sup230 ▴ 60

Hi,

I need a clear explanation of how GSEA relates to other functional analyses in general. I understand that it tests whether an a priori defined gene set differs significantly between two phenotypes, but how is this different from a GO over-representation test or a KEGG pathway analysis? I remember reading that the latter two need a pre-specified significance threshold, whereas some analyses do not require any prior statistical threshold and only look at the relative difference between phenotype groups.

I have done a DE gene analysis using the DESeq2 package to get a list of significant genes between two phenotype groups (in my case, seizure history yes/no, from 475 total samples and about 30k genes). From the significant gene list, I have run GO term over-representation tests (using gprofiler and clusterProfiler) and also used the GAGE package to find KEGG pathways with significant adjusted p-values.

On the GSEA website I read that the Molecular Signatures Database is divided into 8 major collections and sub-collections. Is there a hierarchy among these collections, or do they overlap? To be more specific, will the significant KEGG pathways I get from it differ from those I would get from a different source?

If I end up using the GSEA desktop (Java) software, should I use normalized count data or raw count data? Has anyone run GSEA on RNA-seq data with dimensions like mine (475 samples and 30k genes)? If so, any advice and a brief workflow introduction would be highly appreciated!

RNA-Seq gsea goterm • 1.6k views

updated 6.6 years ago by Biostar 20 • written 6.7 years ago by sup230 ▴ 60

Read Tarca et al. (2013); the 2nd and 3rd paragraphs of the introduction in particular should help clarify some of your doubts.
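
To make the threshold versus threshold-free distinction from the question concrete, here is a minimal Python sketch on toy data. It is only an illustration, not how any of the packages mentioned implement things: the function names `ora_pvalue` and `enrichment_score` and the gene labels are hypothetical. The running sum is the unweighted variant, whereas the GSEA software by default weights hits by the ranking metric and assesses significance by permutation, both of which are left out here.

```python
from math import comb


def ora_pvalue(n_universe, n_set, n_sig, n_overlap):
    """Over-representation test: upper-tail hypergeometric p-value.

    Needs a thresholded "significant" list of size n_sig, drawn from
    n_universe genes, of which n_set belong to the gene set.
    Returns P(overlap >= n_overlap).
    """
    total = comb(n_universe, n_sig)
    upper = min(n_set, n_sig)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_sig - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic over the full ranked list.

    No significance cutoff is applied: every gene contributes through
    its position in the ranking. Returns the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (hypothetical): 10 genes ranked by a DE statistic.
ranked = [f"g{i}" for i in range(1, 11)]
toy_set = {"g1", "g2", "g3"}

# ORA: first cut the list at an arbitrary threshold (top 3), then test overlap.
significant = set(ranked[:3])
p = ora_pvalue(len(ranked), len(toy_set), len(significant),
               len(significant & toy_set))

# Ranked approach: use the whole list, no cutoff.
es = enrichment_score(ranked, toy_set)
print(p, es)
```

Moving the ORA cutoff changes `significant` and therefore the p-value, while the running-sum score depends only on where the set members fall in the ranking.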

6.7 years ago by h.mon 35k