Question

Testing Whether Differences In Amino Acid Hydrophobicity Between Mutant Proteins Are Statistically Significant

4

13.2 years ago

Swatchpuppy ▴ 50

Hello,

Let's imagine that we have the following 5 mutant sequences of protein kinase C:

S1 KVLGKGSFGKVMLADDKGTEELYA 24
S2 MVLFKGSFGKVMLGDRKGTEELYA 24
S3 MVLGKGSFGKVMLADRKG-EELYA 23
S4 MVLGKGSAGKVMLADRKGTEFLYA 24
S5 MVLGKGS-GKVMLFDRKGTEELYA 23
..  ** *** ***** * ** * *** ..

After computing the Kyte & Doolittle hydrophobicity per amino acid position, the following table was obtained:

AA         S1       S2       S3       S4       S5
1       0.633    1.633    1.278    1.278    0.922
2       0.278    1.278    0.922    0.922    0.567
3       0.578    1.578    1.222    1.222    0.867
4       0.578    1.578    1.222    1.111    1.222
5       0.111    1.111    0.756    0.644    0.756
6       0.111    0.467    0.111        0    0.111
7       0.111    0.467    0.111        0    0.111
8        -0.1    0.256     -0.1   -0.211       NA
9       0.367    0.367    0.367    0.256    0.367
10          1    0.756        1    0.889    1.111
11      0.656    0.411    0.656    0.544    0.767
12      0.356        0    0.244    0.133    0.356
13     -0.389   -0.744     -0.5     -0.5   -0.389
14     -0.389   -0.744     -0.5     -0.5   -0.389
15     -0.033   -0.389   -0.144   -0.144   -0.033
16     -0.889   -1.244       -1       -1   -0.889
17     -1.489   -1.844     -1.6     -0.9   -1.489
18     -1.489   -1.844     -1.6     -0.9   -1.489
19     -1.833   -1.944       NA   -1.244   -1.944
20     -1.244   -1.356   -1.356   -0.656   -1.356
21     -0.356   -0.356   -0.356    0.344   -0.356
22     -0.356   -0.356   -0.356    0.344   -0.356
23      0.189    0.189    0.189    0.889    0.189
24      0.689    0.689    0.689    1.389    0.689

There is also a priori knowledge about each mutant's binding to a trial molecule and about each mutant's function:

     Binding   Functional
S1      1           1
S2      1           1
S3      2           0
S4      2           0
S5      0           1

What statistical tests would you recommend to:

1. See whether the differences between the mean hydrophobicity of the mutant proteins are statistically significant?

mu1 = mu2 = mu3 = ... = mu n

2. The same as above, but comparing each pair individually?

     S1  S2  S3  ...  Sn
S2   p   -   -   -    -
S3   p   p   -   -    -
...  .   .   .   -    -
Sn   p   p   p   p    -

3. See whether the amino acid property can somehow be related to the binding and functional properties?

I think a one-way ANOVA won't do much good, because the samples can be seen as paired (paired by amino acid position).
Do you think a repeated-measures ANOVA can be used here?
I was thinking of pairwise.t.test for paired samples in R, with Bonferroni as the method for adjusting p-values.
What do you think of this method?
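
A minimal sketch of the pairwise paired comparison described above, written in Python with only the standard library rather than with R's pairwise.t.test. It assumes the table from this question is stored in a hypothetical SCORES dictionary, drops positions where either mutant of a pair is NA, and obtains the two-sided p-value by numerically integrating the Student t density. The helper names (t_pdf, two_sided_p, paired_t) are illustrative, not from any library. The Bonferroni step multiplies each raw p-value by the number of pairs, capped at 1.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# Windowed hydrophobicity per position, copied from the table above (None = NA).
SCORES = {
    "S1": [0.633, 0.278, 0.578, 0.578, 0.111, 0.111, 0.111, -0.1, 0.367, 1, 0.656, 0.356,
           -0.389, -0.389, -0.033, -0.889, -1.489, -1.489, -1.833, -1.244, -0.356, -0.356,
           0.189, 0.689],
    "S2": [1.633, 1.278, 1.578, 1.578, 1.111, 0.467, 0.467, 0.256, 0.367, 0.756, 0.411, 0,
           -0.744, -0.744, -0.389, -1.244, -1.844, -1.844, -1.944, -1.356, -0.356, -0.356,
           0.189, 0.689],
    "S3": [1.278, 0.922, 1.222, 1.222, 0.756, 0.111, 0.111, -0.1, 0.367, 1, 0.656, 0.244,
           -0.5, -0.5, -0.144, -1, -1.6, -1.6, None, -1.356, -0.356, -0.356, 0.189, 0.689],
    "S4": [1.278, 0.922, 1.222, 1.111, 0.644, 0, 0, -0.211, 0.256, 0.889, 0.544, 0.133,
           -0.5, -0.5, -0.144, -1, -0.9, -0.9, -1.244, -0.656, 0.344, 0.344, 0.889, 1.389],
    "S5": [0.922, 0.567, 0.867, 1.222, 0.756, 0.111, 0.111, None, 0.367, 1.111, 0.767,
           0.356, -0.389, -0.389, -0.033, -0.889, -1.489, -1.489, -1.944, -1.356, -0.356,
           -0.356, 0.189, 0.689],
}


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|] (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def paired_t(a, b):
    """Paired t-test using only positions where both sequences have a value."""
    diffs = [x - y for x, y in zip(a, b) if x is not None and y is not None]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    df = len(diffs) - 1
    return t, df, two_sided_p(t, df)


pairs = list(combinations(SCORES, 2))
results = {}
for a, b in pairs:
    t, df, p = paired_t(SCORES[a], SCORES[b])
    results[(a, b)] = (t, df, p, min(1.0, p * len(pairs)))  # Bonferroni adjustment

for (a, b), (t, df, p, p_adj) in results.items():
    print(f"{a} vs {b}: t={t:.3f} df={df} p={p:.4g} p_bonferroni={p_adj:.4g}")
```

Because the values come from a sliding window, neighbouring positions are not independent, so the p-values from this sketch are best read as illustrative rather than exact.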

NOTE: This is fabricated data.

I have read the http://biostar.stackexchange.com/questions/4208/statistical-analysis-of-protein-sequence-properties post, and I reckon there are a few similarities between the two problems, but even so the objectives are quite different.

Thanks in advance.

statistics protein amino-acids • 3.7k views

ADD COMMENT • link updated 13.2 years ago by Jarretinha 3.4k • written 13.2 years ago by Swatchpuppy ▴ 50

0

Could you please be a little more specific? Which means? Besides, many amino acid properties are not independent, especially on a per-residue basis. Can you state your test question?

ADD REPLY • link 13.2 years ago by Jarretinha 3.4k

0

Yes, I think so. But I think you misunderstood the property: it is a single property, with one observation per amino acid per protein.

H: Are the means of hydrophobicity different between proteins?
(Probably a one-way ANOVA)

H: Which proteins have means of hydrophobicity that differ from each other? (I'm thinking of post hoc tests here)

H: How are the means correlated with binding?
(This is probably a classification problem, or a simple correlation test)

H: How are the means correlated with function?
(idem)

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

0

Yes, I think so. But I think you misunderstood "property": I meant just one observation (hydrophobicity) per amino acid per protein. The hypotheses would be:

H: Are the means of hydrophobicity different between proteins? (Probably a one-way ANOVA)
H: Which proteins have means of hydrophobicity that differ from each other? (I'm thinking of post hoc tests here)
H: How are the means correlated with binding? (This is probably a classification problem, or a simple correlation test)
H: How are the means correlated with function? (idem)

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

0

I forgot to ask: which scale are you using?

ADD REPLY • link 13.2 years ago by Jarretinha 3.4k

0

As promised: a paired t-test with Bonferroni's correction is very similar to ANOVA. In either case, the most precise way to interpret your case is "same subject, different treatments". The tests I suggested assume independent samples. As I said before, on a per-amino-acid basis, hydrophobicity isn't an independent measure. Normally, methods to estimate it use a window of 3-5 aa. Still, you can use ANOVA for testing all pairs.
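
To make the window idea concrete, here is a minimal Python sketch of a sliding-window hydropathy profile. The KD dictionary holds the published Kyte & Doolittle values. The function name windowed_hydropathy, the plain unweighted mean, and the choice to skip positions too close to either end are assumptions made for illustration, not necessarily what ProtScale does.

```python
# Kyte & Doolittle hydropathy values per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(seq, window=5):
    """Unweighted mean hydropathy over an odd-sized window centred on each residue.

    Positions closer than window // 2 to either end are skipped (an assumption;
    other tools may pad or weight the edges instead).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    values = [KD[aa] for aa in seq]  # raises KeyError on gaps or unknown residues
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


s1 = "KVLGKGSFGKVMLADDKGTEELYA"  # sequence S1 from the question
profile5 = windowed_hydropathy(s1, window=5)
profile9 = windowed_hydropathy(s1, window=9)
print([round(v, 3) for v in profile9[:4]])
```

With window=9, the first four values of this sketch match the S1 entries listed for positions 5 to 8 in the question's table (0.111, 0.111, 0.111, -0.1). That suggests, but does not confirm, that a 9-residue window was used there. Each value shares most of its residues with its neighbours, which is why consecutive positions are not independent observations.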

ADD REPLY • link 13.2 years ago by Jarretinha 3.4k

0

You are right about the need for methods that use a window of 3-5 aa. I don't think that applies to this particular analysis (hydrophobicity), but let's imagine that we are studying another property that depends on the surrounding aa: what methods do you think would be appropriate for that?

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

0

Hydrophobicity depends on the neighbors. Check the scales at ExPASy/ProtScale. If you really want to perform a powerful analysis that cross-analyzes sequence-function relationships (aa properties included), you should check the Ranganathan Lab (http://www.hhmi.swmed.edu/Labs/rr/). He developed the most powerful methods to date. Quite laborious, but worth a try!

ADD REPLY • link 13.2 years ago by Jarretinha 3.4k

0

Thanks for the hint. I will tell you about the results.

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

Ram · Answer 1 · 2011-02-04

3

13.2 years ago

Jarretinha 3.4k

You're probably right in your comment. I've checked the hydrophobicity distribution for soluble proteins. It can be safely approximated by a Gaussian (within a protein too!). So you can indeed use ANOVA to investigate differences in mean hydrophobicity. After that, Bartlett's test can check whether the groups have equal variances (you can also use Levene's or Brown-Forsythe). Since these tests compare variances rather than means, separating the groups still needs a post hoc comparison, such as pairwise t-tests with a multiple-testing correction. For the third/fourth point you could use logistic regression or linear discriminant analysis.

Here is the hydrophobicity distribution for Protein kinase C, brain isozyme, using Kyte & Doolittle:

-3.378 -3.11075 3
-3.11075 -2.8435 3
-2.8435 -2.57625 8
-2.57625 -2.309 11
-2.309 -2.04175 12
-2.04175 -1.7745 27
-1.7745 -1.50725 52
-1.50725 -1.24 35
-1.24 -0.97275 55
-0.97275 -0.7055 63
-0.7055 -0.43825 65
-0.43825 -0.171 83
-0.171 0.09625 66
0.09625 0.3635 65
0.3635 0.63075 53
0.63075 0.898 32
0.898 1.16525 19
1.16525 1.4325 8
1.4325 1.69975 4
1.69975 1.967 6
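
One rough way to check the Gaussian approximation from these numbers is to compute moments directly from the bins. The sketch below assumes that the three columns are lower bin edge, upper bin edge, and count (the post does not label them), and it places every count at its bin midpoint. The function name binned_moments is illustrative. For a Gaussian, skewness and excess kurtosis would both be close to 0.

```python
import math

# Histogram copied from the listing above.
# ASSUMPTION: each row is (lower bin edge, upper bin edge, count).
BINS = [
    (-3.378, -3.11075, 3), (-3.11075, -2.8435, 3), (-2.8435, -2.57625, 8),
    (-2.57625, -2.309, 11), (-2.309, -2.04175, 12), (-2.04175, -1.7745, 27),
    (-1.7745, -1.50725, 52), (-1.50725, -1.24, 35), (-1.24, -0.97275, 55),
    (-0.97275, -0.7055, 63), (-0.7055, -0.43825, 65), (-0.43825, -0.171, 83),
    (-0.171, 0.09625, 66), (0.09625, 0.3635, 65), (0.3635, 0.63075, 53),
    (0.63075, 0.898, 32), (0.898, 1.16525, 19), (1.16525, 1.4325, 8),
    (1.4325, 1.69975, 4), (1.69975, 1.967, 6),
]


def binned_moments(bins):
    """Return (n, mean, sd, skewness, excess kurtosis) from binned data.

    Every count is treated as sitting at its bin midpoint, and the
    population (divide-by-n) central moments are used.
    """
    n = sum(count for _, _, count in bins)
    mids = [((lo + hi) / 2, count) for lo, hi, count in bins]
    mean = sum(m * c for m, c in mids) / n
    m2 = sum(c * (m - mean) ** 2 for m, c in mids) / n
    m3 = sum(c * (m - mean) ** 3 for m, c in mids) / n
    m4 = sum(c * (m - mean) ** 4 for m, c in mids) / n
    sd = math.sqrt(m2)
    return n, mean, sd, m3 / sd ** 3, m4 / m2 ** 2 - 3.0


n, mean, sd, skewness, excess_kurtosis = binned_moments(BINS)
print(f"n={n} mean={mean:.3f} sd={sd:.3f} "
      f"skewness={skewness:.3f} excess_kurtosis={excess_kurtosis:.3f}")
```

Moments from binned data are only approximate, so a formal normality test on the raw per-position values would be a stronger check.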

What do you think?

ADD COMMENT • link updated 4.6 years ago by Ram 43k • written 13.2 years ago by Jarretinha 3.4k

0

What does the last column correspond to?

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

0

What is the last column?

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

0

I think that a paired-sample analysis might be more powerful. Read the next answer.

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50

score 1 · Answer 2 · 2011-02-02

1

13.2 years ago

Alastair Kerr 5.3k

I think this data is ideal for a supervised multivariate analysis approach. Check out the MADE4 R package, which might be applicable to this problem (if you can get the data into the right format). The section that applies best is the between-groups analysis [BGA].

ADD COMMENT • link 13.2 years ago by Alastair Kerr 5.3k

0

I haven't had time to look at this package yet; I will give you feedback as soon as possible. Thanks for your help.

ADD REPLY • link 13.2 years ago by Swatchpuppy ▴ 50