Library To Generate Theoretical Tandem Mass Spectra From Arbitrary Peptide Sequences
11.7 years ago
Jdnavarro ▴ 410

I know there are desktop applications that take peptide sequences as input and generate the coordinates of the expected fragments in a tandem mass (MS/MS) spectrum (normally the y and b ions produced by CID fragmentation). I would like to embed that functionality in a program I'm writing. Does anyone know of a library or a simple command-line program that does this?

Alternatively, I would like to know of an open source program from which that functionality would be easy enough to extract.

proteomics library • 5.0k views
11.7 years ago

I don't know of a library, I'm afraid, though I would also be interested to know if there is one. TheGPM makes extensive use of theoretical tandem mass spectra (as you would expect). The most accessible implementation I could find is in the web front end, in the Perl script that generates the ion table you see in the peptide-level view.

TheGPM is available under the Artistic License, and you can grab the particular Perl script I reference above from here. That should give you some pretty good pointers. Have a look at get_y() and get_b(). The full repository is worth a look; most of the heavy lifting is done in C++.


That script is quite straightforward to port to my current application. I'll try to release my ported version as a Python library.
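
For reference, the arithmetic behind a b/y ion table is small enough to sketch in Python. This is not a port of TheGPM's get_b() and get_y(); it is a minimal illustration that assumes an unmodified peptide made of the 20 standard residues, standard monoisotopic masses, and a hypothetical function name, `fragment_ions`.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O, added to y ions (they keep the C-terminus)
PROTON = 1.007276   # mass of one charge carrier


def fragment_ions(peptide, charge=1):
    """Return (label, m/z, fragment sequence) tuples for the b and y ions.

    Hypothetical helper for illustration only. No modifications or neutral
    losses are handled, and an unknown residue letter raises KeyError.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(peptide)
    ions = []
    # b ions: N-terminal fragments, sum of residue masses plus the charge(s).
    for i in range(1, n):
        neutral = sum(masses[:i])
        ions.append((f"b{i}", (neutral + charge * PROTON) / charge, peptide[:i]))
    # y ions: C-terminal fragments, residue masses plus water plus the charge(s).
    for i in range(1, n):
        neutral = sum(masses[-i:]) + WATER
        ions.append((f"y{i}", (neutral + charge * PROTON) / charge, peptide[-i:]))
    return ions


for label, mz, seq in fragment_ions("SIGFEGDSIGR"):
    print(f"{label:>4} {mz:10.5f} {seq}")
```

The sketch only lists fragments of length 1 to n-1, which is the usual convention; some tools also report the full-length ions. Fixed modifications (for example carbamidomethylated cysteine) would need the corresponding mass added to the residue table.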

7.8 years ago
Laurent ★ 1.7k

Using MSnbase, you can compute the theoretical fragments of a peptide with calculateFragments:

> library("MSnbase")
> data(itraqdata)
> xx <- pickPeaks(itraqdata)
> i <- 14
> s <- as.character(fData(xx)[14, "PeptideSequence"])
> calculateFragments(s)
Modifications used: C=160.030649
           mz ion type pos z         seq
1    88.03931  b1    b   1 1           S
2   201.12337  b2    b   2 1          SI
3   258.14483  b3    b   3 1         SIG
4   405.21324  b4    b   4 1        SIGF
5   534.25583  b5    b   5 1       SIGFE
6   591.27729  b6    b   6 1      SIGFEG
7   706.30423  b7    b   7 1     SIGFEGD
8   793.33626  b8    b   8 1    SIGFEGDS
9   906.42032  b9    b   9 1   SIGFEGDSI
10  963.44178 b10    b  10 1  SIGFEGDSIG
11 1119.54289 b11    b  11 1 SIGFEGDSIGR
12  175.11895  y1    y   1 1           R
13  232.14041  y2    y   2 1          GR
14  345.22447  y3    y   3 1         IGR
15  432.25650  y4    y   4 1        SIGR
16  547.28344  y5    y   5 1       DSIGR
17  604.30490  y6    y   6 1      GDSIGR
18  733.34749  y7    y   7 1     EGDSIGR
19  880.41590  y8    y   8 1    FEGDSIGR
20  937.43736  y9    y   9 1   GFEGDSIGR
21 1050.52142 y10    y  10 1  IGFEGDSIGR
22 1137.55345 y11    y  11 1 SIGFEGDSIGR


Or, if you have a spectrum to match peaks:

> calculateFragments(s, xx[[i]])
Modifications used: C=160.030649
          mz ion type pos z         seq        error
12  175.1172  y1    y   1 1           R  0.001778759
2   201.1180  b2    b   2 1          SI  0.005339267
13  232.1375  y2    y   2 1          GR  0.002868275
14  345.2171  y3    y   3 1         IGR  0.007368949
15  432.2482  y4    y   4 1        SIGR  0.008271020
16  547.2679  y5    y   5 1       DSIGR  0.015496664
17  604.2875  y6    y   6 1      GDSIGR  0.017364379
7   706.2938  b7    b   7 1     SIGFEGD  0.010463793
18  733.3439  y7    y   7 1     EGDSIGR  0.003618930
19  880.4140  y8    y   8 1    FEGDSIGR  0.001899535
20  937.4040  y9    y   9 1   GFEGDSIGR  0.033369301
21 1050.5042 y10    y  10 1  IGFEGDSIGR  0.017270609
11 1119.5449 b11    b  11 1 SIGFEGDSIGR -0.002035875
22 1137.5352 y11    y  11 1 SIGFEGDSIGR  0.018294750
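
To make the matching step concrete, here is a rough Python sketch of pairing theoretical fragment m/z values with observed peaks. It is not MSnbase's implementation: the closest-peak rule, the default tolerance of 0.05 and the function name `match_peaks` are all assumptions for illustration. The error is computed as theoretical minus observed, which is the sign convention the table above appears to follow.

```python
def match_peaks(theoretical, observed, tolerance=0.05):
    """Pair each theoretical m/z with the closest observed peak.

    Hypothetical helper: returns (theoretical, observed, error) tuples for
    every theoretical value whose closest observed peak lies within
    `tolerance` (in m/z units). error = theoretical - observed.
    """
    matches = []
    if not observed:
        return matches
    for theo in theoretical:
        # Closest observed peak by absolute m/z difference.
        closest = min(observed, key=lambda obs: abs(obs - theo))
        if abs(closest - theo) <= tolerance:
            matches.append((theo, closest, theo - closest))
    return matches


theoretical = [175.11895, 201.12337, 258.14483]   # y1, b2, b3 from the first table
observed = [175.1172, 201.1180, 300.0]            # two real peaks plus an unrelated one
for theo, obs, err in match_peaks(theoretical, observed):
    print(f"{theo:10.5f} {obs:10.4f} {err: .6f}")
```

Fragments with no peak inside the tolerance window (b3 in this toy input) are simply dropped, which is why the matched table is shorter than the theoretical one.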


More details: ?calculateFragments

An application

> MSnbase:::.plotSingleSpectrum(xx[[i]], s)


I will properly export and document that last function, which is currently unexported (hence the :::) and is currently using plot,Spectrum,Spectrum.