Dear all,
I'd like to compare 6 gene expression datasets (multiple conditions / time-points) in terms of joint patterns in the enrichments of the differentially expressed genes. I have already prepared the DEG lists (logFC, padj). GO and KEGG enrichments would be nice; the more options the better. Everything should be doable via a web server. The enrichment method should be state-of-the-art (GSEA may be a very good choice), and the gene set databases should be up-to-date.
Long ago, I used ToppCluster, which I liked because I could enter all the datasets simultaneously, and it returned the enrichments shared by some of the datasets in a nice table.
- Are there other good options, maybe better ones? Optimally, the tool allows further useful analyses and visualizations.
- For the time series, considering the temporal aspect explicitly would be nice, but usually the first and the last time-point are the most informative, so that it boils down to two conditions.
Kind regards,
georg
Please note that GSEA does not operate on a list of differential genes. It is a method that assesses global shifts/skews towards upregulation or downregulation of a predefined set of genes within the full ranked gene list. In contrast, enrichment analysis for a set of (differential) genes is usually done via a hypergeometric test. The crux there is not so much the statistics behind it as the choice of an appropriate background.

What I'd probably do is one of two things:

1. Run enrichment analysis on the DEGs of each dataset to get terms, and then do a meta-analysis on the enrichment statistics, for example enrichment via clusterProfiler or gprofiler2 and then RobustRankAggreg for the aggregation.
2. Do the meta-analysis on the DEG lists directly, and then scan the genes that are reliably DEG across the majority of datasets for enrichments.

The second option is probably the more robust one, since enrichment tests and gene set databases are quite a mess at times due to little standardization and redundancy of genes contributing to functional terms. That redundancy inflates the multiple-testing burden quite a lot, so the outcome depends strongly on the choice of database, the number of included terms, and the background used for the enrichment analysis (if any is used at all, which again influences the statistics a lot).
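
To illustrate why the background matters, here is a minimal sketch of the one-sided hypergeometric test in plain Python. The function name `hypergeom_enrichment_p` and all gene counts are made up for illustration; they do not come from any real dataset or from the tools mentioned above.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_deg, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_background: number of genes in the background (universe)
    n_term:       background genes annotated to the term
    n_deg:        number of DEGs (all assumed to lie in the background)
    n_overlap:    DEGs annotated to the term
    """
    total = comb(n_background, n_deg)
    upper = min(n_term, n_deg)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: the same 300 DEGs and the same overlap of 8 genes,
# tested once against the whole genome and once against expressed genes only.
p_genome = hypergeom_enrichment_p(20000, 150, 300, 8)
p_expressed = hypergeom_enrichment_p(12000, 120, 300, 8)

print(f"whole-genome background:    p = {p_genome:.3g}")
print(f"expressed-genes background: p = {p_expressed:.3g}")
```

With these assumed counts, the expected overlap by chance is 2.25 genes against the whole genome but 3 genes against the expressed set, so the same observed overlap looks less surprising with the more restrictive background. Multiple-testing correction across terms would still have to be applied on top of these raw p-values.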