Question: Testing Whether Differences In Amino Acid Hydrophobicity Between Mutant Proteins Are Statistically Significant
Asked 8.3 years ago by Swatchpuppy (Portugal):

Hello,

Let's imagine that we have the following 5 mutant sequences of protein kinase C:

S1 KVLGKGSFGKVMLADDKGTEELYA 24
S2 MVLFKGSFGKVMLGDRKGTEELYA 24
S3 MVLGKGSFGKVMLADRKG-EELYA 23
S4 MVLGKGSAGKVMLADRKGTEFLYA 24
S5 MVLGKGS-GKVMLFDRKGTEELYA 23
   .. ** *** ***** * ** * *** ..

After computing the Kyte & Doolittle hydrophobicity per amino acid position, the following table was obtained:

AA    S1         S2      S3        S4       S5
1    0.633     1.633    1.278    1.278    0.922
2    0.278     1.278    0.922    0.922    0.567
3    0.578     1.578    1.222    1.222    0.867
4    0.578     1.578    1.222    1.111    1.222
5    0.111     1.111    0.756    0.644    0.756
6    0.111     0.467    0.111        0    0.111
7    0.111     0.467    0.111        0    0.111
8     -0.1     0.256     -0.1   -0.211       NA
9    0.367     0.367    0.367    0.256    0.367
10       1     0.756    1        0.889    1.111
11   0.656     0.411    0.656    0.544    0.767
12   0.356         0    0.244    0.133    0.356
13  -0.389    -0.744     -0.5     -0.5   -0.389
14  -0.389    -0.744     -0.5     -0.5   -0.389
15  -0.033    -0.389   -0.144   -0.144   -0.033
16  -0.889    -1.244       -1       -1   -0.889
17  -1.489    -1.844     -1.6     -0.9   -1.489
18  -1.489    -1.844     -1.6     -0.9   -1.489
19  -1.833    -1.944       NA   -1.244   -1.944
20  -1.244    -1.356   -1.356   -0.656   -1.356
21  -0.356    -0.356   -0.356    0.344   -0.356
22  -0.356    -0.356   -0.356    0.344   -0.356
23   0.189     0.189    0.189    0.889    0.189
24   0.689     0.689    0.689    1.389    0.689
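
For readers who want to reproduce a profile like the one above, here is a minimal sketch of a sliding-window Kyte & Doolittle average. The window of 9 residues is an assumption: it is consistent with the interior values of the table (for example S1 at positions 8 and 9), but the edge handling below (truncated windows) is a guess and will not reproduce the first and last rows. The function name `kd_profile` is hypothetical.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard residues.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kd_profile(aligned_seq, window=9):
    """Mean KD score over a centred window, one value per alignment column.

    Gap columns ('-') yield None, and gaps inside a window are skipped.
    Windows are truncated at the sequence ends (an assumption).
    """
    half = window // 2
    profile = []
    for i, residue in enumerate(aligned_seq):
        if residue == "-":
            profile.append(None)
            continue
        lo = max(0, i - half)
        hi = min(len(aligned_seq), i + half + 1)
        scores = [KD[r] for r in aligned_seq[lo:hi] if r != "-"]
        profile.append(round(sum(scores) / len(scores), 3))
    return profile


if __name__ == "__main__":
    s1 = "KVLGKGSFGKVMLADDKGTEELYA"
    for position, value in enumerate(kd_profile(s1), start=1):
        print(position, value)
```

Missing values (the NA entries in the table) come out as `None`, so any downstream test has to decide how to treat those positions.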

There is also a priori knowledge about each mutant's binding to a trial molecule and about each mutant's function:

     Binding   Functional
S1      1           1
S2      1           1
S3      2           0
S4      2           0
S5      0           1

What statistical tests would you recommend to:

1. See if the differences between the mean hydrophobicity of each mutant protein are statistically significant?

mu1 = mu2 = mu3 = ... = mu n

2. Do the same as above, but comparing each pair individually?

     S1  S2  S3  S4 ... Sn
S2   p   -   -   -  -
S3   p   p   -   -  -
...  .   .   .   -  -
Sn   p   p   p   p  -

3. See if the amino acid property can somehow be related to the binding and functional properties?


  1. I think that a one-way ANOVA won't do much good, because we can see these as paired samples, paired by amino acid position.
    Do you think that a repeated measures ANOVA can be used here?

  2. I was thinking of pairwise.t.test for paired samples in R, with Bonferroni as the method for adjusting p-values.
    What do you think of this method?
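
As a rough illustration of point 2, the sketch below runs every pairwise paired comparison and applies a Bonferroni adjustment. Note that it swaps in a sign-flip permutation test on the paired differences instead of the paired t-test that `pairwise.t.test` uses, simply so that it needs no statistics library. The function names, the number of permutations and the seed are all assumptions made for the example.

```python
import itertools
import random


def paired_perm_p(x, y, n_perm=5000, rng=None):
    """Two-sided sign-flip permutation p-value for the mean paired difference.

    Pairs with a missing value (None) are dropped.
    """
    rng = rng or random.Random(0)
    diffs = [a - b for a, b in zip(x, y) if a is not None and b is not None]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def pairwise_bonferroni(samples, n_perm=5000, seed=0):
    """samples: dict of name -> per-position values, aligned across names.

    Returns a dict of (name_a, name_b) -> Bonferroni-adjusted p-value.
    """
    rng = random.Random(seed)
    pairs = list(itertools.combinations(samples, 2))
    adjusted = {}
    for a, b in pairs:
        p = paired_perm_p(samples[a], samples[b], n_perm, rng)
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return adjusted


if __name__ == "__main__":
    # Rows 1-7 of the table above, for three of the mutants only.
    demo = {
        "S1": [0.633, 0.278, 0.578, 0.578, 0.111, 0.111, 0.111],
        "S2": [1.633, 1.278, 1.578, 1.578, 1.111, 0.467, 0.467],
        "S3": [1.278, 0.922, 1.222, 1.222, 0.756, 0.111, 0.111],
    }
    for pair, p in pairwise_bonferroni(demo).items():
        print(pair, round(p, 4))
```

Bonferroni multiplies each p-value by the number of comparisons, so with 5 mutants (10 pairs) it becomes quite conservative. Neither this test nor the paired t-test accounts for the dependence between neighbouring positions that a windowed scale introduces.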


NOTE: This is fabricated data.

I have read the http://biostar.stackexchange.com/questions/4208/statistical-analysis-of-protein-sequence-properties post, and I reckon that there are a few similarities between the two problems, but even so the objectives are quite different.


Thanks in advance.

Tags: amino-acids, protein, statistics

Could you please be a little more specific? Which means? Besides that, many amino acid properties are not independent, especially on a per-residue basis. Can you state your test question?

Reply written 8.3 years ago by Jarretinha

Yes, I think so. But I think you misunderstood: there is just one property (hydrophobicity), with one observation per amino acid per protein.

H: Are the mean hydrophobicities different between proteins?
(Probably a one-way ANOVA)

H: Which proteins have mean hydrophobicities that differ from each other? (I'm thinking about post-hoc tests here)

H: How are the means correlated with binding?
(This is probably a classification problem, or a simple correlation test)

H: How are the means correlated with function?
(idem)
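
A minimal sketch of the "simple correlation test" idea: take the mean hydrophobicity of each mutant (skipping missing positions) and compute Pearson's r against the binding scores from the question. Only the first three rows of the table are used here to keep the example short, and the helper names are hypothetical. With only 5 proteins, any such correlation is very weakly supported.

```python
import math


def mean_ignore_missing(values):
    """Arithmetic mean of the values that are not None."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # First three table rows per mutant (illustration only, not the full data).
    profiles = {
        "S1": [0.633, 0.278, 0.578],
        "S2": [1.633, 1.278, 1.578],
        "S3": [1.278, 0.922, 1.222],
        "S4": [1.278, 0.922, 1.222],
        "S5": [0.922, 0.567, 0.867],
    }
    binding = [1, 1, 2, 2, 0]  # S1..S5, from the question
    means = [mean_ignore_missing(p) for p in profiles.values()]
    print(round(pearson_r(means, binding), 3))
```

For the 0/1 "Functional" column the same Pearson formula gives the point-biserial correlation.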

Reply written 8.3 years ago by Swatchpuppy

I forgot to ask: which scale are you using?

Reply written 8.3 years ago by Jarretinha

As I promised: a paired t-test with Bonferroni's correction is very similar to ANOVA. In either case, the most precise way to interpret your data is "same subject, different treatments". The tests I suggested assume independent samples. As I said before, on a per-residue basis hydrophobicity isn't an independent measure; methods to estimate it normally use a window of 3-5 aa. Still, you can use ANOVA to test all pairs.
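
To make the "same subject, different treatments" reading concrete, here is a sketch of the F statistic for a one-factor repeated-measures ANOVA, with alignment positions as subjects and mutants as treatments. Dropping positions that have a missing value is an assumption, as is the function name `rm_anova_f`. The sketch stops at the F statistic and its degrees of freedom, and it ignores the dependence between neighbouring positions mentioned above.

```python
def rm_anova_f(rows):
    """F statistic for a one-factor repeated-measures ANOVA.

    rows: one list per position, holding one value per mutant (None = missing).
    Returns (F, df_treatment, df_error). Positions with a missing value
    are dropped, so only complete rows are analysed.
    """
    data = [r for r in rows if None not in r]
    n, k = len(data), len(data[0])
    grand = sum(sum(r) for r in data) / (n * k)
    treatment_means = [sum(r[j] for r in data) / n for j in range(k)]
    subject_means = [sum(r) / k for r in data]

    ss_treatment = n * sum((m - grand) ** 2 for m in treatment_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((v - grand) ** 2 for r in data for v in r)
    # Whatever is left after removing treatment and subject effects is error.
    ss_error = ss_total - ss_treatment - ss_subject

    df_treatment = k - 1
    df_error = (k - 1) * (n - 1)
    f = (ss_treatment / df_treatment) / (ss_error / df_error)
    return f, df_treatment, df_error


if __name__ == "__main__":
    # Rows 8-10 of the table (S1..S5); row 8 has an NA and is dropped.
    rows = [
        [-0.1, 0.256, -0.1, -0.211, None],
        [0.367, 0.367, 0.367, 0.256, 0.367],
        [1.0, 0.756, 1.0, 0.889, 1.111],
    ]
    print(rm_anova_f(rows))
```

With only two treatments this F equals the square of the paired t statistic, which is one way to see why the two approaches are so close.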

Reply written 8.3 years ago by Jarretinha

You are right about the need for methods that use a window of 3-5 aa. I don't think that applies to this particular analysis (hydrophobicity), but let's imagine that we are studying another property that depends on the surrounding aa: what methods do you think would be appropriate for that?

Reply written 8.3 years ago by Swatchpuppy

Hydrophobicity depends on the neighbors. Check the scales at ExPASy/ProtScale. If you really, really want to perform a powerful analysis of sequence-function relationships (aa properties included), you should check out the Ranganathan Lab (http://www.hhmi.swmed.edu/Labs/rr/). He developed the most powerful methods to date. Quite laborious, but worth a try!

Reply written 8.3 years ago by Jarretinha

Thanks for the hint. I will tell you about the results.

Reply written 8.3 years ago by Swatchpuppy
Answer by Jarretinha (São Paulo, Brazil), 8.3 years ago:

You're probably right in your comment. I've checked the hydrophobicity distribution for soluble proteins. It can be safely approximated by a Gaussian (inside the protein too!). So you can indeed use ANOVA to investigate the differences in mean hydrophobicity. After that, a Bartlett's test would suffice to separate the groups. You can also use Levene's or Brown-Forsythe. For the third/fourth point you could use logistic regression or linear discriminant analysis.
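
A small sketch of two of the statistics named above, under the assumption that each mutant's profile is treated as an independent group: the one-way ANOVA F statistic, and the Brown-Forsythe statistic, which is the same F computed on absolute deviations from each group's median. One caveat worth keeping in mind: Bartlett's, Levene's and Brown-Forsythe tests are usually described as tests of equal variances (an ANOVA assumption), whereas separating group means is normally done with a post-hoc comparison such as Tukey's HSD. The function names are hypothetical.

```python
from statistics import mean, median


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def brown_forsythe_w(groups):
    """Brown-Forsythe statistic: ANOVA F on |value - group median|."""
    deviations = [[abs(v - median(g)) for v in g] for g in groups]
    return one_way_f(deviations)


if __name__ == "__main__":
    # Rows 1-5 of the table for S1 and S2 (illustration only).
    s1 = [0.633, 0.278, 0.578, 0.578, 0.111]
    s2 = [1.633, 1.278, 1.578, 1.578, 1.111]
    print(one_way_f([s1, s2]), brown_forsythe_w([s1, s2]))
```

Both statistics are compared against an F distribution with (k - 1, N - k) degrees of freedom to get a p-value, which is left out here.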

Here is the hydrophobicity distribution for protein kinase C, brain isozyme, using Kyte & Doolittle:

-3.378 -3.11075 3
-3.11075 -2.8435 3
-2.8435 -2.57625 8
-2.57625 -2.309 11
-2.309 -2.04175 12
-2.04175 -1.7745 27
-1.7745 -1.50725 52
-1.50725 -1.24 35
-1.24 -0.97275 55
-0.97275 -0.7055 63
-0.7055 -0.43825 65
-0.43825 -0.171 83
-0.171 0.09625 66
0.09625 0.3635 65
0.3635 0.63075 53
0.63075 0.898 32
0.898 1.16525 19
1.16525 1.4325 8
1.4325 1.69975 4
1.69975 1.967 6
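
One quick way to look at the Gaussian approximation is to compute moments from the histogram. The sketch below assumes the three columns are bin lower edge, bin upper edge and number of residues (the post does not label them), and places every count at the midpoint of its bin. The function name is hypothetical.

```python
import math

# (lower edge, upper edge, count): column meaning is an assumption.
BINS = [
    (-3.378, -3.11075, 3), (-3.11075, -2.8435, 3), (-2.8435, -2.57625, 8),
    (-2.57625, -2.309, 11), (-2.309, -2.04175, 12), (-2.04175, -1.7745, 27),
    (-1.7745, -1.50725, 52), (-1.50725, -1.24, 35), (-1.24, -0.97275, 55),
    (-0.97275, -0.7055, 63), (-0.7055, -0.43825, 65), (-0.43825, -0.171, 83),
    (-0.171, 0.09625, 66), (0.09625, 0.3635, 65), (0.3635, 0.63075, 53),
    (0.63075, 0.898, 32), (0.898, 1.16525, 19), (1.16525, 1.4325, 8),
    (1.4325, 1.69975, 4), (1.69975, 1.967, 6),
]


def binned_moments(bins):
    """Return (n, mean, sd, skewness) using bin midpoints as the values."""
    n = sum(count for _, _, count in bins)
    mids = [((lo + hi) / 2, count) for lo, hi, count in bins]
    mean = sum(m * c for m, c in mids) / n
    var = sum(c * (m - mean) ** 2 for m, c in mids) / n
    skew = sum(c * (m - mean) ** 3 for m, c in mids) / n / var ** 1.5
    return n, mean, math.sqrt(var), skew


if __name__ == "__main__":
    n, mean, sd, skew = binned_moments(BINS)
    print(n, round(mean, 3), round(sd, 3), round(skew, 3))
```

A skewness near zero is compatible with a roughly symmetric, Gaussian-like shape, but it is not a formal normality test.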

What do you think?


What does the last column correspond to?

Reply written 8.3 years ago by Swatchpuppy

I think that maybe a paired-sample analysis would be more powerful. Read the next answer.

Reply written 8.3 years ago by Swatchpuppy
Answer by Alastair Kerr (The University of Edinburgh, UK), 8.3 years ago:

I think that this data is ideal for a supervised multivariate analysis approach. Check out the MADE4 R package, which might be applicable to this problem (if you can get the data into the right format). The section that applies best is the between-group analysis (BGA).


I haven't had time to look at this package yet; I will give you feedback on it as soon as possible. Thanks for your help.

Reply written 8.3 years ago by Swatchpuppy