Question: Mass Spec: How to get unmatched MS spectra?
Asked 6 months ago by nattzy94:


I am trying to analyze MS data for novel proteins. So far my workflow has been to convert mzML files to MGF files, then search the MGF files against a FASTA file of annotated protein sequences from UniProt (about 20,000 proteins). The search is run with SearchGUI.

This gives me txt files listing the proteins that were found and the spectra that matched them. What I want to do next is search the unmatched spectra against a customised database in order to discover novel proteins, similar to how this paper (Erady, 2020) describes it:

In order to evade the increase in false-positive rates, MS data is first mapped to known proteins in UniProt database, and then the unmatched spectra are mapped to the custom proteogenomic database as done by us previously in Prabakaran et al.

This seems like a pretty common thing to do, as I've seen a number of papers describe it. However, I can't figure out how to extract the unmatched spectra for the second search. Does anyone have experience doing this?
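To make the step concrete, here is a minimal sketch of what I have in mind: read the titles of the matched spectra from the search report, then copy every spectrum whose title is not in that set into a new MGF file. It rests on assumptions, not on anything SearchGUI documents: that the report is tab-separated, that it has a column holding the spectrum title (the name "Spectrum Title" is a placeholder), and that those titles are identical to the TITLE= lines in the MGF. The function names are made up for illustration.

```python
import csv


def read_matched_titles(report_path, title_column="Spectrum Title"):
    """Collect titles of matched spectra from a tab-separated report.

    The column name is an assumption; adjust it to the actual report header.
    """
    with open(report_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[title_column].strip() for row in reader if row.get(title_column)}


def iter_mgf_spectra(mgf_path):
    """Yield (title, lines) for each BEGIN IONS ... END IONS block."""
    title, block, inside = None, [], False
    with open(mgf_path) as handle:
        for raw in handle:
            line = raw.rstrip("\r\n")
            stripped = line.strip()
            if stripped == "BEGIN IONS":
                title, block, inside = None, [line], True
            elif inside:
                block.append(line)
                if stripped.startswith("TITLE="):
                    title = stripped[len("TITLE="):].strip()
                elif stripped == "END IONS":
                    yield title, block
                    inside = False


def write_unmatched(mgf_path, matched_titles, out_path):
    """Write spectra whose title is not in matched_titles; return how many were kept."""
    kept = 0
    with open(out_path, "w") as out:
        for title, block in iter_mgf_spectra(mgf_path):
            if title not in matched_titles:
                out.write("\n".join(block) + "\n\n")
                kept += 1
    return kept
```

Usage would be something like `write_unmatched("run1.mgf", read_matched_titles("run1_psms.txt"), "run1_unmatched.mgf")`, with the output then searched against the custom database. Note that "matched" here just means "listed in the report", so the report should already be filtered to confident identifications before the titles are collected.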

Tags: mass spec, proteomics

Unfortunately no. I've only ever built databases of the organisms I expected to be relevant to the sample. I understand the desire not to concatenate multiple FASTA files into one huge database, since that would interfere with PSM probabilities. But what if some spectra actually match better to proteins in the second database? I don't really like this method. Is it commonly used?

Let me know if you come up with a solution :(

Reply written 6 months ago by N15120