I am fairly certain there is no such standard, but I'm also fairly certain some other people must have thought about this.
One advantage of a standard format is that it would simplify running multiple enrichment tools in parallel and comparing or combining their results. This is particularly useful to us within the GO consortium, as we would like to compare analyses between newer and older versions of the ontology and annotations. A more ambitious aim is for publications that include GO enrichment results to provide them in a standard format, which would make the results easier to replicate.
Note that it would not be necessary for all tools to be conformant in order for the standard to be successful. Converters could be provided to rewrite the ad-hoc output of heterogeneous tools to the standard form. However, it would help to have buy-in from some of the more popular tools.
I have listed some desiderata for such a standard:
- An abstract specification with different serializations for different purposes (tabular, JSON, XML, RDF)
- Use of ontology terms in place of free text to describe algorithms, parameters and data processing (for example, the Ontology for Biomedical Investigations (OBI) has a rich collection of these)
- Tool name + algorithm + version
- Input token list + token type (e.g. symbol)
- Background token list + token type (if provided)
- Token-gene ID mapping (plus unmatched tokens)
- Algorithm parameters (cut-offs, algorithm selected, etc)
- Ontology ID + version
- Gene association set ID / species + version
- List of results - for each result:
  - term ID
  - optional term metadata
  - list of gene IDs (+ optional gene metadata)
  - scoring metadata (p-vals, rank, etc)
- Unique identifier/URI for the results
- Metadata on input token set (e.g. "genes up-regulated in diabetes")
- graphical output
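
To make the list above more concrete, here is a minimal sketch of how the abstract model might look in Python, with a JSON and a tabular serialization. Everything in it is an assumption for illustration only: the class names (`TermResult`, `EnrichmentReport`), the field names, the tool name, the gene identifiers, the URI and the scores are all hypothetical and are not taken from any existing tool or standard. Graphical output and ontology-term descriptions of the algorithm are left out.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional


@dataclass
class TermResult:
    """One enriched term (hypothetical field names)."""
    term_id: str
    gene_ids: List[str]               # input genes annotated to the term
    scores: Dict[str, float]          # scoring metadata, e.g. p-value, rank
    term_label: Optional[str] = None  # optional term metadata


@dataclass
class EnrichmentReport:
    """One enrichment run: provenance, inputs, parameters and results."""
    result_uri: str                   # unique identifier/URI for the results
    tool: Dict[str, str]              # tool name + algorithm + version
    ontology: Dict[str, str]          # ontology ID + version
    annotation_set: Dict[str, str]    # association set ID / species + version
    input_tokens: List[str]
    input_token_type: str             # e.g. "symbol"
    token_to_gene: Dict[str, str]     # token -> gene ID mapping
    unmatched_tokens: List[str]
    parameters: Dict[str, float]      # cut-offs and other algorithm settings
    results: List[TermResult]
    background_tokens: Optional[List[str]] = None
    input_description: Optional[str] = None

    def to_json(self) -> str:
        # asdict() recurses into the nested TermResult objects.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def to_rows(self) -> List[str]:
        # Tabular serialization: a header line, then one line per term.
        rows = ["term_id\tterm_label\tgene_ids\tp_value"]
        for r in self.results:
            rows.append("\t".join([
                r.term_id,
                r.term_label or "",
                ",".join(r.gene_ids),
                str(r.scores.get("p_value", "")),
            ]))
        return rows


# Made-up example data; none of these values come from a real analysis.
report = EnrichmentReport(
    result_uri="https://example.org/enrichment/run-1",
    tool={"name": "example-tool", "algorithm": "hypergeometric", "version": "0.1"},
    ontology={"id": "go", "version": "example-release"},
    annotation_set={"id": "example-annotations", "species": "example-species",
                    "version": "example-release"},
    input_tokens=["geneA", "geneB", "NOT_A_GENE"],
    input_token_type="symbol",
    token_to_gene={"geneA": "EX:0001", "geneB": "EX:0002"},
    unmatched_tokens=["NOT_A_GENE"],
    parameters={"p_value_cutoff": 0.05},
    results=[
        TermResult(
            term_id="GO:0006006",
            term_label="glucose metabolic process",
            gene_ids=["EX:0001", "EX:0002"],
            scores={"p_value": 0.01, "rank": 1},
        )
    ],
    input_description="genes up-regulated in diabetes",
)

print(report.to_json())
print("\n".join(report.to_rows()))
```

Keeping the model separate from the serializers is the point of the first desideratum: a converter for an existing tool would only need to populate the model, and every serialization would then come for free.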
Is this of general interest? If so, does the above sound like a good start, and what would be an appropriate forum for future discussions? Is there an existing tool whose output might be a good candidate for standardization?