Protein database for MS-based protein identification in rat
5.8 years ago
bzhtgb ▴ 10

Hi all,

My question concerns the identification of proteins by mass spectrometry.

As you know, the last step of identification is to match the identified peptides against a protein database.

For human, the standard databases are Swiss-Prot, RefSeq, and so on. Swiss-Prot, the manually reviewed part of UniProt, is the way to identify only proteins with a high level of confidence.

But my question concerns identification in rat. There are only ~8000 rat proteins in Swiss-Prot. If we use Swiss-Prot + TrEMBL, we will pick up a lot of false positives (fragments, spurious splice variants, ...). Do you know what the gold standard is for rat? Which databases are commonly used for rat in proteomics?

Thanks in advance.


5.8 years ago

What's wrong with using UniProt (>35000 rat proteins)? I think this is what is typically used. Fragments shouldn't be too much of a problem if you want to identify genes and don't care about whole proteins.
UniProt also provides proteomes by combining genome annotations with SwissProt.
If you're really worried about the quality of the protein sequences, you could build your own database in which you could include proteins based on phylogenetic analysis and/or other criteria.
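
To make the "build your own database" idea a bit more concrete, here is a minimal sketch of one possible filter. It assumes UniProt-style FASTA headers, where reviewed entries start with `sp|`, unreviewed ones with `tr|`, fragments carry `(Fragment)` in the protein name, and the protein existence level appears as `PE=<n>`. The cutoff `max_pe=2`, the function names and the toy entries are my own assumptions for illustration, not an established standard.

```python
import re
from typing import Iterator, List, Tuple

Entry = Tuple[str, str]  # (header without ">", sequence)

# Toy entries with invented accessions and sequences, for illustration only.
TOY_FASTA = """>sp|TOY001|TOY1_RAT Toy protein 1 OS=Rattus norvegicus OX=10116 GN=Toy1 PE=1 SV=1
MKTAYIAK
>tr|TOY002|TOY2_RAT Toy protein 2 (Fragment) OS=Rattus norvegicus OX=10116 GN=Toy2 PE=2 SV=1
MKT
>tr|TOY003|TOY3_RAT Toy protein 3 OS=Rattus norvegicus OX=10116 GN=Toy3 PE=4 SV=1
MAAAK
>tr|TOY004|TOY4_RAT Toy protein 4 OS=Rattus norvegicus OX=10116 GN=Toy4 PE=2 SV=1
MGGGK
"""

PE_PATTERN = re.compile(r"\bPE=(\d)\b")


def parse_fasta(text: str) -> Iterator[Entry]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_entry(header: str, max_pe: int = 2) -> bool:
    """Keep all reviewed entries; keep unreviewed ones only if they are
    not fragments and have a protein existence level <= max_pe."""
    if header.startswith("sp|"):
        return True
    if "(Fragment)" in header:
        return False
    match = PE_PATTERN.search(header)
    return match is not None and int(match.group(1)) <= max_pe


def filter_fasta(text: str, max_pe: int = 2) -> List[Entry]:
    """Return the entries of a FASTA text that pass keep_entry."""
    return [(h, s) for h, s in parse_fasta(text) if keep_entry(h, max_pe)]


if __name__ == "__main__":
    for header, _ in filter_fasta(TOY_FASTA):
        print(header.split("|")[1])
```

Whether dropping fragments or low-evidence entries is a good idea depends on the experiment, so treat the criteria as placeholders to adapt.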

5.7 years ago

We often use a combination of rat and mouse protein sequences for this, e.g. the protein sets from UniProt proteomes or Ensembl. The reasons for this are that the rat genome annotation leaves quite a bit to be desired, the mouse genome annotation is a lot better, and mouse and rat proteins have high sequence identity. By "padding" the rat protein set with mouse proteins, we are thus able to increase the number of peptide identifications.
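
A rough sketch of how such a combined search database could be assembled, assuming both protein sets are already loaded as (header, sequence) pairs. The names `merge_databases` and `to_fasta` and the toy entries are hypothetical. Rat entries are added first so that, when a mouse protein has exactly the same sequence, the rat header is the one that is kept.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (header, sequence)


def merge_databases(rat: List[Entry], mouse: List[Entry]) -> List[Entry]:
    """Combine rat and mouse entries, dropping exact duplicate sequences.

    Rat entries come first, so a rat header wins over a mouse header
    whenever both share an identical full-length sequence.
    """
    by_sequence: Dict[str, str] = {}
    for header, sequence in rat + mouse:
        by_sequence.setdefault(sequence, header)
    return [(header, sequence) for sequence, header in by_sequence.items()]


def to_fasta(entries: List[Entry], width: int = 60) -> str:
    """Format entries as FASTA text with wrapped sequence lines."""
    lines: List[str] = []
    for header, sequence in entries:
        lines.append(">" + header)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Invented identifiers and sequences, for illustration only.
    rat = [("rat|R1", "MKTAYIAK"), ("rat|R2", "MAAAK")]
    mouse = [("mouse|M1", "MKTAYIAK"), ("mouse|M2", "MGGGK")]
    print(to_fasta(merge_databases(rat, mouse)))
```

This only removes identical full-length sequences. Peptides shared between a rat protein and a similar but non-identical mouse protein remain ambiguous and still need protein inference.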

This strategy does come at the cost of a more complex downstream analysis, since some identifications will map to mouse proteins instead of rat proteins. These then have to be mapped back to rat proteins based on orthology.
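
The mapping-back step could look roughly like the sketch below. It assumes you already have a mouse-to-rat ortholog table (for example exported from an orthology resource) loaded as a dictionary. The function name `map_hits_to_rat` and all identifiers are made up for illustration. Orthology is not always one-to-one, so each mouse protein maps to a list of rat proteins, and hits without a known ortholog are reported separately rather than silently dropped.

```python
from typing import Dict, Iterable, List, Set, Tuple


def map_hits_to_rat(
    hits: Iterable[str],
    rat_ids: Set[str],
    mouse_to_rat: Dict[str, List[str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Map identified protein IDs onto rat protein IDs.

    Rat hits map to themselves, mouse hits map to their rat ortholog(s),
    and anything else is returned in the list of unmapped hits.
    """
    mapped: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for protein_id in hits:
        if protein_id in rat_ids:
            mapped[protein_id] = [protein_id]
        elif mouse_to_rat.get(protein_id):
            mapped[protein_id] = list(mouse_to_rat[protein_id])
        else:
            unmapped.append(protein_id)
    return mapped, unmapped


if __name__ == "__main__":
    # Invented identifiers, for illustration only.
    rat_ids = {"RAT_A", "RAT_B", "RAT_C"}
    mouse_to_rat = {"MOUSE_X": ["RAT_B"], "MOUSE_Y": ["RAT_B", "RAT_C"]}
    hits = ["RAT_A", "MOUSE_X", "MOUSE_Y", "MOUSE_Z"]
    mapped, unmapped = map_hits_to_rat(hits, rat_ids, mouse_to_rat)
    print(mapped)
    print(unmapped)
```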


