Problems With A Python Script Used To Simulate Protein Cutting By Trypsin
9.6 years ago

Hello, I have a problem with my Python script, which should simulate protein cutting by trypsin (it cuts after K and R, but not if the next residue is P):

for i in range(len(s)-1):
        if ( (s[i]=='K' or s[i]=='R')and (s[i+1]!='P')):
                f.append(s[n:i+1])
                n=i-1

The problem is that in some places it works properly and does not cut a sequence at KP or PR, but in other places it does cut there. Does anyone know why? Thanks for any answer! zuzka

python • 3.5k views
9.6 years ago
Xtof ▴ 170

Hi

I'm not sure why it doesn't work, but you might consider using a regular expression instead:

import re
pattern = re.compile('[KR][^P]')  # the regular expression: K or R followed by anything but P
peptides = pattern.split(sequence)  # split() splits on the pattern defined just above and returns a list

If I try it with sequence = 'LTRPTGKJHIKPTHHKTTGHV'

it returns peptides = ['LTRPTG', 'HIKPTHH', 'TGHV']

It should work for you.
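One thing to keep in mind with '[KR][^P]' is that split() discards the characters the pattern matches, so in the example above 'KJ' and 'KT' are missing from the output. A minimal sketch of a variant that keeps every residue, assuming Python 3.7 or later (where re.split can split on empty matches); the names CUT_SITE and digest are made up for this illustration:

```python
import re

# Zero-width pattern: a position preceded by K or R and not followed by P.
# Nothing is consumed, so no residues are lost when splitting.
# Splitting on empty matches requires Python 3.7 or later.
CUT_SITE = re.compile(r'(?<=[KR])(?!P)')

def digest(sequence):
    """Split after every K or R that is not followed by P."""
    # A sequence ending in K or R yields a trailing empty string; drop it.
    return [pep for pep in CUT_SITE.split(sequence) if pep]

print(digest('LTRPTGKJHIKPTHHKTTGHV'))  # ['LTRPTGK', 'JHIKPTHHK', 'TTGHV']
```

Here the K or R stays at the end of the peptide it belongs to, which matches "cuts after K and R" in the question.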

9.6 years ago

I think you would be better off using regular expressions for this.

Then, in the other answer by Xtof, use

 '[^P][KR][^P]'

as the regex.

However, if you want to do it the clumsy way and also want to prevent PR from being cut, you must check for that as well.

Something like:

if (s[i]=='K' or s[i]=='R') and s[i+1]!='P' and (i==0 or s[i-1]!='P'):  # i==0 guard: s[-1] would wrap around to the last residue

should work in your if clause.
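For the regex route, note that a pattern such as '[^P][KR][^P]' removes all three matched characters when used with split(). A hedged sketch of the same two conditions written with lookarounds instead, so that nothing is consumed; it assumes Python 3.7 or later, and STRICT_CUT and digest_strict are hypothetical names used only here:

```python
import re

# Cut position: directly after K or R, unless that K/R is preceded by P
# (no cut inside PK/PR) or followed by P (no cut inside KP/RP).
# All parts are zero-width; splitting on empty matches needs Python 3.7+.
STRICT_CUT = re.compile(r'(?<=[KR])(?<!P[KR])(?!P)')

def digest_strict(sequence):
    """Split after K/R, skipping sites with a P on either side."""
    # Drop the empty string produced when the sequence ends at a cut site.
    return [pep for pep in STRICT_CUT.split(sequence) if pep]

print(digest_strict('AAPRGGKTTPKLLRVV'))  # ['AAPRGGK', 'TTPKLLR', 'VV']
```

In this made-up test sequence the R in 'PR' and the K in 'PK' are left uncut, while the K in 'GKT' and the R in 'LRV' are cut.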

9.6 years ago
Geparada ★ 1.5k

I think your code doesn't work because "n" is not defined before the "if ( (s[i]=='K' or s[i]=='R')and (s[i+1]!='P')):" line, and even if you define "n" before the if, slicing "s" between "n" and "i+1" won't give you the proper fragment.

I made an adaptation of your code that has two loops: the first one gets the cut points, and the second one makes the cuts and stores the fragments in "f":

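def main(s):
    cuts = [0]  # fragment boundaries, starting at the beginning of s
    f = []      # resulting fragments

    # First loop: collect the cut points (after K or R, unless P follows).
    for i in range(len(s) - 1):
        if (s[i] == 'K' or s[i] == 'R') and s[i+1] != 'P':
            cuts.append(i + 1)
    cuts.append(len(s))

    # Second loop: slice s between consecutive cut points.
    for a, b in zip(cuts, cuts[1:]):
        f.append(s[a:b])

    print(f)

main("LTRPTGKJHIKPTHHKTTGHV")

That prints the following list:

['LTRPTGK', 'JHIKPTHHK', 'TTGHV']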

If you want to optimise this code, you can fuse the two loops and do everything in one loop, as you attempted in your code, but I think this way of writing it is more intuitive and easier to read (at least for me).
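For reference, a minimal sketch of what the fused single-loop version could look like, placing each cut after the K or R as the question describes. Compared with the loop in the question, the start index (called "n" there) is initialised to 0 and moved to i + 1 rather than i - 1, and the remaining tail is appended after the loop. The names digest_single_loop, fragments and start are illustrative only:

```python
def digest_single_loop(s):
    """Cut after every K or R that is not followed by P, in one pass."""
    fragments = []
    start = 0  # index where the current fragment begins
    for i in range(len(s) - 1):
        if s[i] in 'KR' and s[i + 1] != 'P':
            fragments.append(s[start:i + 1])  # include the K/R itself
            start = i + 1                     # next fragment starts right after it
    fragments.append(s[start:])  # whatever remains after the last cut
    return fragments

print(digest_single_loop("LTRPTGKJHIKPTHHKTTGHV"))  # ['LTRPTGK', 'JHIKPTHHK', 'TTGHV']
```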

Cheers,
