Question

Grateful for advice on proteomic QC

4.9 years ago

andrew ▴ 10

I have begun work on a multi-omic project, and the proteomic data has become available first. I am the bioinformatician, not on the lab side. A small initial batch (extract from serum, trypsin digest) was run at two laboratories (lab 1 and lab 2) for comparison, and a larger batch was then run at lab 2 only. We are keen on high sensitivity to low-abundance species.

I have grave concerns about the data quality from lab 2, initially because of poor rates of spectral matching, and then after interrogating the QC metrics from RawTools. Although I believe I have good grounds for my concern, I don't have access to independent proteomic/mass spec expertise to confirm my suspicions.

I would be very grateful if anyone has the expertise and time to comment on the below (these are medians of per-sample medians, sorry!)

Lab 1

In-gel digestion
Loaded 3 micrograms over 120 minute gradient
Orbitrap Fusion Tribrid - precursors on Orbitrap, fragments on ion trap
MS1 injection times median 0.47ms
Median 17,083 MS1s and 78,908 MS2s per run
Duty cycle median 0.43s
Median of MS1 median intensities 2.6×10^5
Median of peak parent scan intensities 7.2×10^5
Median MS2 median peak intensities 89.5

Lab 2 round 1

In solution digestion
Loaded 0.5 micrograms over ~90 minute gradient
Orbitrap Fusion Lumos - precursors and fragments on Orbitrap
MS1 injection times median 27.8ms
Median 5,039 MS1s and 24,767 MS2s per run
Duty cycle median 0.94s
Median of MS1 median intensities 5.3×10^3
Median of peak parent scan intensities 1.8×10^4
Median MS2 median peak intensities 1812 (but on Orbitrap not ion trap)

Lab 2 round 2

Same machine but fragments now on linear ion trap
MS1 injection times median 23.5ms
Median 5,212 MS1s and 28,351 MS2s per run
Duty cycle median 0.93s
Median of MS1 median intensities 8.7×10^3
Median of peak parent scan intensities 3.5×10^4
Median MS2 median peak intensities 21.9

Lab 2 have explained that they load less to avoid overloading the column and that the QC values are as they would expect. However, I find it extraordinary that MS1 accumulations are taking ~50 times as long as lab 1, and even with this the median MS1 intensity is ~50-fold lower. This makes me suspect that the rate of peptide entry into the mass spec is ~2500 times lower than lab 1 (which could not be explained by 6-fold lower loading).
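The arithmetic behind that suspicion can be sketched as below, using only the lab 1 and lab 2 round 1 medians quoted above. It treats median MS1 intensity divided by injection time as a crude proxy for the rate of ion entry, which is an assumption of this sketch; whether the instrument software already normalizes reported intensities for injection time is not checked here. The variable names are illustrative.

```python
# Medians copied from the lab 1 and lab 2 (round 1) lists above.
lab1 = {"ms1_injection_ms": 0.47, "ms1_median_intensity": 2.6e5, "load_ug": 3.0}
lab2 = {"ms1_injection_ms": 27.8, "ms1_median_intensity": 5.3e3, "load_ug": 0.5}

# How much longer lab 2 accumulates ions for each MS1 scan.
injection_ratio = lab2["ms1_injection_ms"] / lab1["ms1_injection_ms"]

# How much lower the median MS1 intensity is at lab 2.
intensity_ratio = lab1["ms1_median_intensity"] / lab2["ms1_median_intensity"]

# Crude proxy (assumption): fold difference in signal per unit injection time.
flux_ratio = injection_ratio * intensity_ratio

# Fold difference in the amount of peptide loaded on the column.
loading_ratio = lab1["load_ug"] / lab2["load_ug"]

# Portion of the difference that loading alone does not account for.
unexplained_ratio = flux_ratio / loading_ratio

print(f"injection time ratio: {injection_ratio:.1f}")
print(f"MS1 intensity ratio:  {intensity_ratio:.1f}")
print(f"proxy flux ratio:     {flux_ratio:.0f}")
print(f"loading ratio:        {loading_ratio:.1f}")
print(f"unexplained ratio:    {unexplained_ratio:.0f}")
```

With the unrounded medians the two ratios come out at roughly 59 and 49, so the product is nearer 2,900 than 2,500; either way, several hundred fold remains after dividing out the 6-fold difference in loading.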

My preliminary conclusion is that in lab 2:

Very small amounts of peptide are entering the mass spectrometer
Only the most abundant species have any chance of being selected for MS2
These are selected from the most intense precursors
It takes so long to accumulate for MS1 and MS2 that we only get about 1/3 of the number of MS2s obtained at lab 1
MS2s (linear ion trap at least) are still less intense than at lab 1 and therefore have a lower chance of being ID'ed
The data from lab 2 will in no way be suitable for interrogation for low abundance proteins
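Since the two labs used different gradient lengths, the "1/3 of the MS2s" comparison can also be expressed per minute of gradient. The sketch below does this for lab 1 and lab 2 round 1 only, because a gradient length is not stated for round 2. It assumes the acquisition time equals the quoted gradient length (the lab 2 figure is itself approximate), so the per-minute rates are indicative only.

```python
# Median MS2 counts per run and gradient lengths quoted in the lists above.
runs = {
    "lab 1": {"ms2": 78908, "gradient_min": 120},
    "lab 2 round 1": {"ms2": 24767, "gradient_min": 90},  # "~90 minute" gradient
}

# MS2 scans acquired per minute of gradient (assumes run time == gradient time).
ms2_per_min = {name: r["ms2"] / r["gradient_min"] for name, r in runs.items()}

# Ratio of raw MS2 counts, lab 2 relative to lab 1.
raw_ratio = runs["lab 2 round 1"]["ms2"] / runs["lab 1"]["ms2"]

# Ratio of MS2 acquisition rates, lab 2 relative to lab 1.
rate_ratio = ms2_per_min["lab 2 round 1"] / ms2_per_min["lab 1"]

for name, rate in ms2_per_min.items():
    print(f"{name}: {rate:.0f} MS2 scans per minute")
print(f"raw MS2 count ratio: {raw_ratio:.2f}")
print(f"MS2 rate ratio:      {rate_ratio:.2f}")
```

On these numbers lab 2 acquires about 0.31 times as many MS2 scans per run, or about 0.42 times as many per minute of gradient, so the shorter gradient accounts for only part of the gap.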

However, given my lack of expertise in this area, it is difficult for me to know if my reasoning is sound.

All advice welcome!

Thanks in advance,

A

mass spectrometry proteomics quality control • 893 views

ADD COMMENT • link 4.9 years ago by andrew ▴ 10

So, this is another case of the bioinformatician being dumped with a mess / terrible experimental design? Who even made the decision to run a proteomics experiment in 2 different labs? <- I hope not you (?). Does each lab even profile the same proteins? It would only make sense if this were some trial experiment and that you were going to check each run independently for QC before making a final decision on which lab to use for a larger run.

The QC metrics that you've listed in your question are, for me, things on which the labs running the experiment itself should advise. As an analyst, I would be interested in checking histograms, box-and-whisker & scatter plots, etc.

ADD REPLY • link 4.9 years ago by Kevin Blighe 87k

Thanks Kevin. The use of two labs was indeed part of a pilot, so a small batch of samples was run in lab 1 and lab 2. However, I became involved after lab 2 was selected, and I believe the choice was made for reasons other than any detailed quality control.

Lab 2 (indirect contact) have already said they don't share my concern, and that the results are as they would expect. However, I find it hard to accept because:

MS1 accumulations are taking ~50 times as long as lab 1, and even with this the median MS1 intensity is ~50-fold lower

I'm hoping that anyone who has significant experience with RAW data coming from Orbitrap MS proteomics experiments would have a view! Unfortunately I'm struggling to find that locally.

ADD REPLY • link 4.9 years ago by andrew ▴ 10

Yes, I replied because I felt that this would just go unanswered. This is an atypical question. Long ago, I worked as a lab assistant in a lab at the University of York that was building its own mass spectrometer. There are probably a whole host of reasons for the lower intensity after longer accumulation time, e.g., different columns, elution rates, solvent, etc.

Apart from this, how does the data look in histograms, boxplots, etc?

We recently ran 2 runs separately at Francis Crick Institute in London and observed good reproducibility between both runs.

ADD REPLY • link 4.9 years ago by Kevin Blighe 87k

Thanks for stepping in. Looking around I can see that it's atypical - perhaps I should have tried ResearchGate!

I'm not sure exactly what histograms and box plots you're referring to, but I have QC metrics from RawTools (from which the measures above are derived).

My initial concern was caused by the low numbers of identified spectra. I should add that the ID rate at 1% FDR averaged 4% across lab 2 and 14% with lab 1. A relatively low rate would not be unexpected because the samples are rich in immunoglobulin, but 4% seemed much too low.
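To put those ID rates in terms of spectra, the sketch below multiplies them by the median MS2 counts per run quoted in the question. This assumes the ID rate is a fraction of MS2 spectra and that the lab-wide average applies to a median run (and to both lab 2 rounds), so the results are rough estimates rather than measured counts.

```python
# Median MS2 scans per run, from the question.
median_ms2 = {
    "lab 1": 78908,
    "lab 2 round 1": 24767,
    "lab 2 round 2": 28351,
}

# Average ID rates at 1% FDR quoted above (assumed to apply per MS2 spectrum).
id_rate = {
    "lab 1": 0.14,
    "lab 2 round 1": 0.04,
    "lab 2 round 2": 0.04,
}

# Rough estimate of identified spectra in a median run.
estimated_ids = {name: median_ms2[name] * id_rate[name] for name in median_ms2}

for name, n in estimated_ids.items():
    print(f"{name}: ~{n:.0f} identified spectra per run (estimate)")

# Fold difference between lab 1 and lab 2 round 1.
fold = estimated_ids["lab 1"] / estimated_ids["lab 2 round 1"]
print(f"lab 1 vs lab 2 round 1: ~{fold:.1f}-fold")
```

Under those assumptions a median lab 1 run yields on the order of 11,000 identified spectra against roughly 1,000 for lab 2, about an eleven-fold difference, because the lower ID rate compounds with the lower number of MS2 scans.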

ADD REPLY • link 4.9 years ago by andrew ▴ 10