Hello,
I have a dataset containing drugs and their corresponding indications. Using this dataset, I applied three different clustering methods to try to group drugs with similar indications. To identify which method produced the clustering that best reflects my data, I ran an enrichment analysis of drug therapeutic classes for each cluster and method using Fisher's exact test.
I now have the corresponding p-values for each cluster. I am unsure how to identify which p-values actually represent a significant enrichment of a drug class within a cluster, and how to compare the enrichment results between clustering methods.
When comparing the enrichment results, I wish to state how many unique drug classes are significantly enriched within each cluster and then compare this with the other clustering methods. I do not know what an appropriate p-value threshold would be for my data, as the p-values span a large range across the enrichment results from all three clustering methods.
Then you could use FDR-adjusted p-values, which correct for the many tests you are running, and set your threshold to the proportion of false positives among your significant results that is acceptable to you, e.g. 0.1 or 0.05 or even lower.
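As a rough illustration of the FDR adjustment suggested above, here is a minimal Python sketch. It assumes the Benjamini-Hochberg procedure (one common FDR method; nothing in this thread names a specific one), applies the correction separately within each clustering method, and uses made-up p-values, method names and drug classes. `bh_adjust` and `enriched_classes_by_cluster` are hypothetical helper names, not functions from any library.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enriched_classes_by_cluster(tests, alpha=0.05):
    """Map each cluster to the set of drug classes with adjusted p <= alpha.

    `tests` maps (cluster, drug_class) -> raw Fisher's exact test p-value.
    """
    keys = list(tests)
    adjusted = bh_adjust([tests[k] for k in keys])
    enriched = {}
    for (cluster, drug_class), q in zip(keys, adjusted):
        if q <= alpha:
            enriched.setdefault(cluster, set()).add(drug_class)
    return enriched


# Hypothetical raw p-values for two of the clustering methods (invented data)
results = {
    "kmeans": {
        (1, "antihypertensive"): 0.0004,
        (1, "antibiotic"): 0.03,
        (2, "antibiotic"): 0.0001,
        (2, "statin"): 0.6,
    },
    "hierarchical": {
        (1, "statin"): 0.2,
        (1, "antibiotic"): 0.04,
        (2, "antihypertensive"): 0.5,
    },
}

summary = {}
for method, tests in results.items():
    per_cluster = enriched_classes_by_cluster(tests, alpha=0.05)
    # Unique drug classes enriched in at least one cluster of this method
    unique_classes = set().union(*per_cluster.values())
    summary[method] = (per_cluster, unique_classes)
    print(method, per_cluster, len(unique_classes))
```

Whether to adjust within each method (as here) or across all tests from all three methods at once is a judgment call; adjusting across everything is the more conservative choice. Note that in the invented "hierarchical" data a raw p-value of 0.04 no longer passes once it is adjusted, which is the point of the correction.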