Is There A Single Letter Protein Alphabet That Encodes Modified Residues?
12.4 years ago

Hi,

I'm interested in protein modifications and was wondering whether there is an existing standard for expressing modified residues with a single character, e.g. using { or similar instead of pS (phosphoserine). Ideally it could also cope with multiple modifications of the same residue, e.g. tri-methylation.

protein sequence annotation • 3.5k views

Must they be letters? For example, # could be phosphothreonine, $ phosphoserine and ^ phosphotyrosine.


It's been a while since biochem, but I don't think there are enough letters for that. There are 20 AAs and only 26 letters in the alphabet.


They don't have to be letters, just characters. So yes, ^, & and so on would work.
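A minimal sketch of the symbol idea, assuming the three characters suggested above (# for phosphothreonine, $ for phosphoserine, ^ for phosphotyrosine). These are ad hoc choices from this thread, not an established standard, and `SYMBOL_TO_RESIDUE` and `decode` are hypothetical names.

```python
# Ad hoc symbol table taken from this thread (not a standard alphabet).
SYMBOL_TO_RESIDUE = {
    "#": ("T", "phospho"),
    "$": ("S", "phospho"),
    "^": ("Y", "phospho"),
}


def decode(encoded):
    """Split an encoded sequence into the plain sequence and a list of
    (0-based position, modification) pairs."""
    plain = []
    mods = []
    for i, ch in enumerate(encoded):
        if ch in SYMBOL_TO_RESIDUE:
            residue, mod = SYMBOL_TO_RESIDUE[ch]
            plain.append(residue)
            mods.append((i, mod))
        else:
            # Any other character is treated as an unmodified residue.
            plain.append(ch)
    return "".join(plain), mods


plain, mods = decode("MALLIV$DFK")
print(plain)  # MALLIVSDFK
print(mods)   # [(6, 'phospho')]
```

Each residue/modification combination needs its own symbol in this scheme, so the printable characters would be used up fairly quickly.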


Hi Niallhaslam,

Do you feel your question was addressed? If yes, could you please accept an answer and if no, do let us know!

Thank you!

12.4 years ago

"an already existing standard to express modified residues"

There is an ontology for the modified amino acids:

12.4 years ago

It seems to me that the question could be "Is there a single letter symbol that encodes amino acid modifications?" as you could always align the modified protein to its "naked" sequence:

MALLIVSDFKvDGSTWP
......p....s.....

Some ideas for these single-letter codes:
p = phosphorylated residue
m = methylated
c = carboxylated
a = acetylated
s = sumoylated
m = myristoylated (this clashes with m = methylated above, so one of the two would need a different letter)

and so on. Annotating against a reference sequence in this way is how a lot of human genome data will be organized, stored and displayed, especially as personal genomics grows.
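The alignment idea above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the annotation track has the same length as the sequence, "." means unmodified, and only the unambiguous codes from the list are used. `CODES` and `read_track` are hypothetical names.

```python
# Codes from the list above; "m" is used for methylated only, since
# myristoylation would need a letter of its own.
CODES = {
    "p": "phosphorylated",
    "m": "methylated",
    "c": "carboxylated",
    "a": "acetylated",
    "s": "sumoylated",
}


def read_track(sequence, track):
    """Return (1-based position, residue, modification) for every
    annotated position in the track."""
    if len(sequence) != len(track):
        raise ValueError("sequence and track must have the same length")
    sites = []
    for pos, (residue, code) in enumerate(zip(sequence, track), start=1):
        if code == ".":
            continue  # unmodified position
        # An unknown code raises KeyError here.
        sites.append((pos, residue, CODES[code]))
    return sites


for site in read_track("MALLIVSDFKvDGSTWP", "......p....s....."):
    print(site)
```

Keeping the modifications in a separate track leaves the plain sequence untouched, so it can still be used with ordinary sequence tools.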


I could see myself implementing this using letter_annotations in Biopython. Thanks.


You're welcome. Remember, BioStar works as intended when good responses and questions are voted up and bad ones down. One thing you may need to consider with the approach above is tissue or temporal specificity. Imagine looking at cell cycle regulators where the "p" for phosphorylation is present in, say, G1 but not in S phase, or, for a different protein, only in liver and kidney but not in muscle.
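One possible way to handle tissue or temporal specificity, sketched under assumptions, is to keep one annotation track per condition against the same reference sequence. The condition labels and the second track below are invented for illustration and are not real data; `sites_by_condition` is a hypothetical name.

```python
reference = "MALLIVSDFKvDGSTWP"

# Hypothetical tracks: one per condition, each as long as the reference.
tracks = {
    "G1": "......p....s.....",
    "S": "...........s.....",
}


def sites_by_condition(reference, tracks):
    """Map each modified 1-based position to {condition: code}."""
    sites = {}
    for condition, track in tracks.items():
        if len(track) != len(reference):
            raise ValueError(f"track {condition!r} has the wrong length")
        for pos, code in enumerate(track, start=1):
            if code != ".":
                sites.setdefault(pos, {})[condition] = code
    return sites


print(sites_by_condition(reference, tracks))
```

The same layout could also hold several modifications of one residue, for instance by adding one track per modification type.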
