I am using the simplifyGO function (from the simplifyEnrichment package) to get a summarized version of a large GO enrichment analysis output. I use it to cluster my enriched GO terms based on a semantic similarity matrix (which can be computed with methods like Resnik, Simrel, etc.). As a result, I get my list of GO terms, each now annotated with the number of the cluster it belongs to.
So far so good.
Now, in my GO enrichment visualization, I want to show only the most specific GO term for each cluster I obtained (i.e. the most downstream one in the GO hierarchy or, loosely speaking, the "most child" one), because I would rather not show the word cloud that this package can produce.
- Is there a way to accomplish this?
- Moreover, I'd like to summarize the outputs of multiple gene set enrichments in a single, fair dotplot/heatmap (i.e. a fair summary of several GO enrichment summaries in the same plot). So far the only idea I have come up with is to take, for each cluster, the GO term shared by the most sets (i.e. if a GO term is enriched in 5 of my 6 sets, I'd prefer it over one enriched in 3 out of 6). Then, if multiple GO terms from the same cluster are enriched in the same number of sets, I'd prefer the one with the lowest average p-value.
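
To make the selection rule in the second point concrete, here is a rough sketch of what I have in mind. It is written in plain Python purely as an illustration, not as anything simplifyGO provides: the `hits` layout, the term names, the set names and the p-values are all made up, and I assume one record per (set, term) pair.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input: one record per (set, GO term) enrichment hit.
# Term names, clusters, set names and p-values are invented for illustration.
hits = [
    {"set": "A", "go": "termA", "cluster": 1, "pvalue": 1e-3},
    {"set": "B", "go": "termA", "cluster": 1, "pvalue": 2e-3},
    {"set": "C", "go": "termA", "cluster": 1, "pvalue": 3e-3},
    {"set": "A", "go": "termB", "cluster": 1, "pvalue": 1e-6},
    {"set": "B", "go": "termB", "cluster": 1, "pvalue": 1e-6},
    {"set": "A", "go": "termC", "cluster": 2, "pvalue": 0.01},
    {"set": "B", "go": "termC", "cluster": 2, "pvalue": 0.03},
    {"set": "B", "go": "termD", "cluster": 2, "pvalue": 0.001},
    {"set": "C", "go": "termD", "cluster": 2, "pvalue": 0.005},
]

def pick_representatives(hits):
    """Return {cluster: term}, preferring terms enriched in more sets,
    then terms with the lower mean p-value (term name breaks exact ties)."""
    pvals = defaultdict(list)  # term -> p-values, one per set it is enriched in
    cluster_of = {}            # term -> cluster number
    for h in hits:
        pvals[h["go"]].append(h["pvalue"])
        cluster_of[h["go"]] = h["cluster"]

    by_cluster = defaultdict(list)  # cluster number -> its terms
    for term, cluster in cluster_of.items():
        by_cluster[cluster].append(term)

    # min() over (-number of sets, mean p-value, name) implements the ranking.
    return {
        cluster: min(terms, key=lambda t: (-len(pvals[t]), mean(pvals[t]), t))
        for cluster, terms in by_cluster.items()
    }

print(pick_representatives(hits))
```

With this toy data, cluster 1 would be represented by `termA` (3 sets beats 2, despite the higher p-values) and cluster 2 by `termD` (same number of sets, lower mean p-value). Averaging raw p-values is just the simplest reading of my idea; averaging on a log scale might be more sensible.
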
Any suggestions (packages, functions, programs) would be very welcome.