My dictionary defines a benchmark as "a level of quality that can be used as a standard when comparing other things". So a benchmark method is a standard method that can be used when you want to compare other methods.
As far as I have seen, packages such as "edgeR", "DESeq", and "EBSeq" are treated as benchmark methods for detecting differentially expressed (DE) genes. When you do not have access to simulated data and want to compare two methods, can you assume that "edgeR" (or any other benchmark method) is the standard and use it to decide which of your methods is better? In other words, can we compare two models against edgeR and choose the one whose results are closest to edgeR's as the better model?
I would also like to know how you decide which model is better when you only have a real dataset. Is it by the percentage of genes found by each method that overlap with the genes found by edgeR? I know another thing we can do is check how many of the head genes (index genes) or housekeeping genes are found by each model. What else can you recommend?
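To make the two comparisons I mention concrete, here is a minimal Python sketch of what I have in mind. The gene IDs and the `edger_calls`, `model_a_calls`, `model_b_calls`, and `known_genes` sets are made up for illustration, and the function names are my own, not part of edgeR or any other package. It only quantifies agreement with edgeR and recovery of a known gene set; whether that agreement means a model is better is exactly what I am asking.

```python
def overlap_fraction(called, reference):
    """Fraction of the genes called by a method that the reference also calls."""
    called, reference = set(called), set(reference)
    if not called:
        return 0.0
    return len(called & reference) / len(called)


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recovered_fraction(called, known):
    """Fraction of a known gene set (e.g. index genes) that a method calls."""
    called, known = set(called), set(known)
    return len(called & known) / len(known) if known else 0.0


# Hypothetical toy data: DE gene calls from edgeR and from two candidate models.
edger_calls = {"g1", "g2", "g3", "g4"}
model_a_calls = {"g1", "g2", "g3", "g7"}
model_b_calls = {"g1", "g5", "g6"}

# Hypothetical curated set (head/index genes or housekeeping genes).
known_genes = {"g1", "g2", "g5"}

for name, calls in [("model A", model_a_calls), ("model B", model_b_calls)]:
    print(
        name,
        "overlap with edgeR: %.2f" % overlap_fraction(calls, edger_calls),
        "Jaccard with edgeR: %.2f" % jaccard(calls, edger_calls),
        "known genes recovered: %.2f" % recovered_fraction(calls, known_genes),
    )
```

In this toy example the two models recover the same share of the known genes but differ a lot in their overlap with edgeR, which is the kind of situation where I am unsure which criterion to trust.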