Question

Computing the percentage of secondary-structure and disordered residues from two text files


7.8 years ago

basuraj1991 • 0

Hi friends, I am a beginner in Python. I want to analyse the secondary-structure properties of several hundred protein sequences. I have already obtained the secondary-structure and disordered-region analyses as separate text files from two different servers. The two files look like this:

Secondary structure file (ss.txt)

Index AA SS ASA Phi Psi Theta(i-1=>i+1) Tau(i-2=>i+1) P(C) P(E) P(H)

1 M C 131.5 -93.4 122.8 112.2 -162.9 0.928 0.031 0.047

2 H C 114.1 -96.4 121.2 112.4 -161.9 0.683 0.207 0.080

3 I C 83.0 -86.7 106.4 107.8 -157.8 0.556 0.326 0.092

4 Q H 114.2 -82.3 -18.9 107.9 157.8 0.586 0.180 0.173

5 S H 77.6 -88.9 41.5 108.3 99.6 0.624 0.199 0.139

6 L C 95.5 -81.4 25.1 102.4 -177.8 0.860 0.052 0.071

7 G S 48.5 91.0 -16.2 110.2 -29.4 0.843 0.044 0.113

8 A S 57.7 -83.1 136.6 113.8 91.4 0.800 0.112 0.106

Disorder file (dis.txt)

Index AA Binary Probability

1 M D 0.97272
2 H D 0.96426
3 I O 0.96352
4 Q O 0.96778
5 S O 0.97184
6 L D 0.97648
7 G D 0.97955
8 A O 0.98359

Giving priority to disorder, I want to write a script that replaces the secondary-structure element (H, S or C) with a disordered element (D) wherever the disorder file marks a residue as disordered, and then calculates the percentage of H, S, C and D residues for every sequence. I have saved the secondary-structure files as 1.ss.txt, 2.ss.txt, 3.ss.txt, etc., and the disorder files as 1.dis.txt, 2.dis.txt, 3.dis.txt, respectively, so 1.ss.txt and 1.dis.txt both belong to protein no. 1.
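To make the goal concrete, here is a minimal sketch of the kind of script described above. It is not a tested solution for the real server output. It assumes that both files are whitespace-separated, that the third column holds the state (SS in the secondary-structure file, Binary in the disorder file), that "D" in the Binary column means disordered, and that all files sit in one folder. The function names (read_column, merge_states, percentages, summarise_directory) are made up for illustration.

```python
from collections import Counter
from pathlib import Path


def read_column(lines, column):
    """Map residue index -> value of the given whitespace-separated column.

    Rows whose first field is not an integer (header, blank lines) are skipped.
    """
    values = {}
    for line in lines:
        fields = line.split()
        if len(fields) <= column or not fields[0].isdigit():
            continue
        values[int(fields[0])] = fields[column]
    return values


def merge_states(ss_lines, dis_lines, disorder_flag="D"):
    """Return one state per residue, with disorder overriding the SS state.

    Assumption: the third column (index 2) is SS in the first file and
    Binary in the second, and disorder_flag marks a disordered residue.
    """
    ss = read_column(ss_lines, 2)
    dis = read_column(dis_lines, 2)
    return [disorder_flag if dis.get(i) == disorder_flag else ss[i]
            for i in sorted(ss)]


def percentages(states):
    """Return {state: percentage of residues in that state}."""
    counts = Counter(states)
    total = len(states)
    return {state: 100.0 * n / total for state, n in counts.items()}


def summarise_directory(folder="."):
    """Pair every N.ss.txt with N.dis.txt in folder and compute percentages."""
    results = {}
    for ss_path in sorted(Path(folder).glob("*.ss.txt")):
        name = ss_path.name[: -len(".ss.txt")]
        dis_path = ss_path.with_name(name + ".dis.txt")
        if not dis_path.exists():
            continue  # no matching disorder file, skip this protein
        with ss_path.open() as ss_file, dis_path.open() as dis_file:
            states = merge_states(ss_file, dis_file)
        results[name] = percentages(states)
    return results


if __name__ == "__main__":
    for name, pct in summarise_directory().items():
        summary = ", ".join(f"{s}: {p:.1f}%" for s, p in sorted(pct.items()))
        print(f"{name}\t{summary}")
```

For the eight example residues shown above, the merged states would be D D C H H D D S, that is 50% D, 25% H, 12.5% C and 12.5% S. Any other state letter that appears in the SS column (for example E) is counted under its own key.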

sequence alignment secondry structural analysis • 1.5k views
