Every time I run differential expression, I prefer to run enrichment analysis on heatmap clusters rather than on all differentially expressed genes, because sometimes you get apparent subgroups within your cohort. This is especially true for transcriptomes of human patients with differing clinical variables. In my case, a few clusters have very high enrichment scores (from EnrichR), while other clusters of much the same size (~50-100 genes) have very few or even zero significant enrichments. The same is true for protein-protein interaction networks. Should clusters like these be ignored? I assume this is unwanted noise: even though those genes were differentially expressed, they might have shown up by chance. Has this been discussed by other authors?
Depending on which clustering method you are using and how many samples you have, you could be clustering on noise. It could also be that they are legitimate clusters in the statistical sense, but that the clusters are not biologically meaningful. This really has to be judged case by case.
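One rough way to check whether a cluster's enrichment is stronger than chance is to compare it against random gene sets of the same size. The sketch below is only an illustration under stated assumptions: it uses a plain hypergeometric over-representation test (not EnrichR's combined score), the function names `hypergeom_sf`, `best_enrichment_p` and `empirical_p` are made up for this example, and the gene IDs and the single `pathway_A` gene set are toy data. The background is assumed to be all genes that were tested.

```python
import random
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(overlap >= k) when n genes are drawn from N, K of which are in the set."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def best_enrichment_p(cluster, gene_sets, background):
    """Smallest hypergeometric p-value over all gene sets for one cluster."""
    cluster = set(cluster) & background
    N, n = len(background), len(cluster)
    best = 1.0
    for members in gene_sets.values():
        members = set(members) & background
        k = len(cluster & members)
        best = min(best, hypergeom_sf(k, N, len(members), n))
    return best


def empirical_p(cluster, gene_sets, background, n_perm=200, seed=0):
    """Compare a cluster's best p-value with random gene sets of equal size."""
    rng = random.Random(seed)
    pool = sorted(background)  # sorted list so sampling is reproducible
    size = len(set(cluster) & background)
    observed = best_enrichment_p(cluster, gene_sets, background)
    hits = 0
    for _ in range(n_perm):
        random_cluster = rng.sample(pool, size)
        if best_enrichment_p(random_cluster, gene_sets, background) <= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical gene IDs, assumption for illustration only).
background = {f"g{i}" for i in range(200)}
gene_sets = {"pathway_A": {f"g{i}" for i in range(20)}}
cluster_signal = [f"g{i}" for i in range(15)] + [f"g{i}" for i in range(100, 135)]
cluster_quiet = [f"g{i}" for i in range(50, 100)]

for name, cluster in [("signal", cluster_signal), ("quiet", cluster_quiet)]:
    obs, emp = empirical_p(cluster, gene_sets, background)
    print(name, "best p:", obs, "empirical p:", emp)
```

A cluster whose best p-value is no better than what random sets of the same size produce does not show enrichment beyond chance. That alone does not prove the cluster is noise; it only says the annotation databases have nothing to add for it. Because the comparison uses the minimum p-value across all gene sets, the empirical p-value also accounts for having tested many gene sets at once.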