Is there a (semi-)official specification and API for proteomics data, both for file-based and distributed storage?
Like there is for genomics data:
Specs:
File storage
http://samtools.github.io/hts-specs
Distributed storage
https://github.com/bigdatagenomics/adam
https://github.com/bigdatagenomics/bdg-formats
API:
File storage
https://github.com/samtools/htsjdk
API for reading from distributed storage?
I remember there are formats like mzML, mzIdentML and mzQuantML from the HUPO Proteomics Standards Initiative. Have these taken off (that is, been widely accepted and used) as the standards for proteomics data? Is there also an API (like HTSJDK) and a distributed storage variant (an Eve to ADAM :) )?
Thanks very much. I also found a list of tools that import or export mzIdentML on the PSI website: http://www.psidev.info/tools-implementing-mzidentml I don't know whether these tools also support the other formats, though.
For identification results, mzIdentML (version 1.1) is widely used. I think pepXML is still in use in the TPP suite. OpenMS also has (or had) its own XML-based formats, at least internally, but nowadays supports the official ones.
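Since mzIdentML is plain XML, a generic XML parser is enough for a quick look at a file even without a dedicated API. Below is a minimal sketch using only the Python standard library. The embedded sample document is made up for illustration and is far from a complete, schema-valid file; the function name `read_identifications` is hypothetical, and the namespace URI is assumed to be the one used by mzIdentML 1.1.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for mzIdentML 1.1 documents.
NS = {"mzid": "http://psidev.info/psi/pi/mzIdentML/1.1"}

# Tiny hand-written fragment for illustration only (not schema-complete).
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" id="demo" version="1.1.0">
  <DataCollection>
    <AnalysisData>
      <SpectrumIdentificationList id="SIL_1">
        <SpectrumIdentificationResult id="SIR_1" spectrumID="scan=1" spectraData_ref="SD_1">
          <SpectrumIdentificationItem id="SII_1" rank="1" chargeState="2"
              peptide_ref="PEP_1" experimentalMassToCharge="500.25"
              calculatedMassToCharge="500.24" passThreshold="true"/>
        </SpectrumIdentificationResult>
      </SpectrumIdentificationList>
    </AnalysisData>
  </DataCollection>
</MzIdentML>
"""


def read_identifications(xml_text):
    """Return one dict per SpectrumIdentificationItem found in the document."""
    root = ET.fromstring(xml_text)
    hits = []
    for result in root.iterfind(".//mzid:SpectrumIdentificationResult", NS):
        for item in result.iterfind("mzid:SpectrumIdentificationItem", NS):
            hits.append({
                "spectrum": result.get("spectrumID"),
                "peptide_ref": item.get("peptide_ref"),
                "charge": int(item.get("chargeState")),
                "mz": float(item.get("experimentalMassToCharge")),
            })
    return hits


if __name__ == "__main__":
    for hit in read_identifications(SAMPLE):
        print(hit)
```

For real files, which can be large, a streaming parser or one of the tools from the PSI list is probably the better choice; this only shows that the format is approachable with standard XML tooling.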