I've found Enrichr to be useful. The tables are ranked by the combined score, and in a fair number of experiments the relevant categories appear among the top ~10 gene sets for at least one reference set (ChEA 2016, GO, KEGG, etc.).
In terms of answering your question about how the combined score is calculated, Chen et al. 2013 describe the combined score as
c = log(p) * z, where
c = the combined score,
p = Fisher exact test p-value, and
z = z-score for deviation from expected rank.

So, I think that is how the combined score is being calculated. In a downloaded table for the example differentially expressed gene list, I can replicate that calculation, as long as I use the natural log to transform the unadjusted p-value.
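
For anyone who wants to check this against their own downloaded table, here is a minimal Python sketch of the formula above. The function name and the example numbers are made up for illustration (they are not taken from a real Enrichr result), and the sketch assumes the natural log of the unadjusted p-value, as described above.

```python
import math


def combined_score(p_value, z_score):
    """Return c = ln(p) * z.

    p_value: unadjusted Fisher exact test p-value, in (0, 1]
    z_score: z-score for deviation from expected rank
    """
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in the interval (0, 1]")
    # math.log with one argument is the natural logarithm
    return math.log(p_value) * z_score


# Hypothetical values for illustration only
example_p = 0.001
example_z = -1.5
print(combined_score(example_p, example_z))
```

Since ln(p) is negative for p < 1, a negative z-score gives a positive combined score, so comparing the printed value against the combined score column of a downloaded table is a quick way to confirm which log base was used.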
In terms of the z-score calculation, the Biostar discussion "In Enrichr: What is "Gene weight" or "levels of membership"?" pointed me to the Enrichr Help Center, which describes how the background for the z-score is calculated from a pre-defined table built with random gene lists.
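
As a rough sketch of that idea only: the Python below assumes the pre-defined table stores, for each term, a mean and standard deviation of the rank observed across random gene lists, and that the z-score is the usual standardized deviation from that mean. The table layout, the term name, and the numbers are all hypothetical; this is not Enrichr's actual code or data.

```python
def rank_z_score(observed_rank, expected_mean_rank, expected_sd_rank):
    """Standardize an observed rank against a background distribution.

    Assumption: the background is summarized by a mean and standard
    deviation of the term's rank across random input gene lists.
    """
    if expected_sd_rank <= 0:
        raise ValueError("expected_sd_rank must be positive")
    return (observed_rank - expected_mean_rank) / expected_sd_rank


# Hypothetical lookup table: term -> (mean rank, sd of rank) from random lists
background = {"hypothetical_term_A": (50.0, 10.0)}

mean_rank, sd_rank = background["hypothetical_term_A"]
# A term ranked much better (lower) than expected gets a negative z-score
z = rank_z_score(5, mean_rank, sd_rank)
print(z)
```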