Question: What Tools Are You Using To Check The Quality Of Mass Spectrometry Proteomics Data?
Attila Csordas (520), Cambridge, UK, wrote 8.8 years ago:

You have measured a lot of mass spectra and obtained matching peptides and inferred proteins via search engines, spectral libraries, or both. What tools are you using to cross-check the data? Delta m/z differences, precursor charge assignments, PTMs...

proteomics quality tool mass-spec • 5.1k views
modified 8.8 years ago by Leo (50) • written 8.8 years ago by Attila Csordas (520)
Laurent (1.6k), Cambridge, UK, wrote 8.8 years ago:

I think that's a very good question, and I would argue that the small number of answers reflects the lack of well-established QC methodology in proteomics, especially compared to genetics and transcriptomics (although there may simply be few people working in that field on Biostar). A precise answer will also depend on the sample (simple or complex mixture) and its processing (enrichment, for instance).

Probably the very first thing to do is to look at the raw data as it is produced by the mass spec, i.e. the elution profile and the raw spectra. I am always amazed at how our in-house mass spec specialist can comment on the raw data and quickly assess how good the results are, or at least whether the data is good enough for the question considered. To do this, you really need to know what you are running in the first place and be aware of the capabilities of your machine. Mass spectrometry is still a hands-on experiment in comparison with more mature (and technically easier) technologies like microarrays. Of course, all this requires being where the data is generated, which might not be the case if you work as a bioinformatician and take care of data repositories, for example.

IMHO, there is a need for more QC steps because (1) not everybody has an expert to ask and (2) having automated pipelines that statistically assess QC for single or multiple data sets is crucial. I think that delta m/z differences, precursor charge assignments, PTMs, m/z distributions, etc., as you mention, are a good start. Still, it would be important to formalise the knowledge of the mass spec gurus and implement it in programs. And btw, PRIDE Inspector is a good means to quickly assess public data for meta-analysis.
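To make the "automated pipeline" idea a bit more concrete, here is a minimal sketch of two of the checks mentioned above: precursor mass error in ppm and the precursor charge state distribution. It assumes each peptide-spectrum match (PSM) is available as a plain dict with `observed_mz`, `theoretical_mass` (neutral monoisotopic mass) and `charge`; those field names and the function names are hypothetical and not taken from any particular search engine or tool.

```python
from collections import Counter
from statistics import median

PROTON_MASS = 1.007276466  # mass of a proton in Da


def ppm_error(observed_mz, theoretical_mass, charge):
    """Precursor mass error in ppm for a single PSM.

    theoretical_mass is the neutral monoisotopic peptide mass;
    it is converted to the expected m/z for the given charge.
    """
    theoretical_mz = (theoretical_mass + charge * PROTON_MASS) / charge
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def summarize_psms(psms):
    """Summarise mass errors and charge states for a list of PSM dicts.

    The dict keys used here are an assumption for illustration only.
    """
    errors = [
        ppm_error(p["observed_mz"], p["theoretical_mass"], p["charge"])
        for p in psms
    ]
    charges = Counter(p["charge"] for p in psms)
    total = len(psms)
    return {
        "median_ppm_error": median(errors),
        "max_abs_ppm_error": max(abs(e) for e in errors),
        "charge_fractions": {z: n / total for z, n in sorted(charges.items())},
    }
```

A median ppm error far from zero would hint at a calibration offset, and an unusual charge distribution (for a tryptic digest, say) would be worth a second look; what counts as "unusual" depends on the instrument and the sample.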

Finally, you might be aware of a recent special issue of Proteomics about QC. I have not had time to read it thoroughly, so I can't really point to any specific method.

Hope this helps.

written 8.8 years ago by Laurent (1.6k)

tempted to mark this question as the accepted answer :)

written 8.8 years ago by Attila Csordas (520)

tempted to tag this as a great comment ;-)

written 8.8 years ago by Laurent (1.6k)

I'm sorry, I meant not the 'question' but your answer. What I really had in mind, besides the tools, was a question concerning the problems of QC in proteomics in general, but you obviously got that message too.

written 8.8 years ago by Attila Csordas (520)
Thaman (3.2k) wrote 8.8 years ago:

The Trans-Proteomic Pipeline (TPP) is a free, open-source collection of tools and supporting data formats that enables shotgun proteomics data analysis. If you go through the TPP, you will find several validation tools in the pipeline.

[Figure: image not preserved]

In the figure you can see the PeptideProphet and ProteinProphet nodes, which are the TPP's validation tools for the peptide and protein identifications obtained from a search engine such as SEQUEST.

Hope it helps

modified 8.8 years ago • written 8.8 years ago by Thaman (3.2k)
Julien (150) wrote 8.8 years ago:

Take a look at "Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses" ( They also make available a software pipeline that implements their recommended tests.

written 8.8 years ago by Julien (150)

This is what we use. It is available from It has just about every metric you can think of and then some. This is useful for assessing mass errors, charge state distributions, peak widths, etc. Very useful for determining when your LC or MS performance is changing.
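As a rough illustration of spotting "when your LC or MS performance is changing", the sketch below flags runs whose value for a single metric (for example the median precursor ppm error, or a median peak width) sits far from the rest, using a robust z-score based on the median absolute deviation. This is not how the pipeline mentioned above works internally; the function name, the input layout (run name mapped to one number) and the 3.5 cutoff are all assumptions chosen for the example.

```python
from statistics import median


def flag_drifting_runs(run_metric, threshold=3.5):
    """Return the names of runs whose metric is a robust outlier.

    run_metric maps a run name to one numeric QC metric.
    The robust z-score is 0.6745 * (value - median) / MAD, where MAD is
    the median absolute deviation; the 3.5 cutoff is a common rule of
    thumb and is used here only as an illustrative default.
    """
    values = list(run_metric.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [
        name
        for name, value in run_metric.items()
        if abs(0.6745 * (value - center) / mad) > threshold
    ]


# Hypothetical median ppm errors for five consecutive runs.
runs = {"run1": 1.0, "run2": 1.2, "run3": 0.9, "run4": 1.1, "run5": 6.0}
drifting = flag_drifting_runs(runs)
```

The median and MAD are used instead of mean and standard deviation so that one bad run does not inflate the spread and hide itself.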

To "cross-check" identification assignments, theGPM's validate function is unparalleled. Search your data using their servers or your own GPM installation; then, in the results, click the protein, then the peptide, then the validate link. This will bring up a list of the top ten assignments to that peptide in the GPMdb.

written 8.6 years ago by Brianbalgley (100)
Leo (50), United States, wrote 8.6 years ago:

The following papers may be relevant to your question:

written 8.6 years ago by Leo (50)