Question: Calculating z-scores and p-values from ratios
2.2 years ago by
mimA30
European Union
mimA30 wrote:

Hello all,

I have a question that I can't seem to figure out. I have protein data for 2 different treatments (3 samples each) and only 1 control sample. I want to run some statistics to find differences between the 2 treatment conditions, but using the control, because it is important for this experiment. Someone suggested that I calculate the fold change (FC) for each treatment sample relative to the one control sample and then convert these ratios to z-scores. I have now converted them to z-scores, so for each protein I have 3 z-scores in treatment 1 and 3 z-scores in treatment 2. Is this approach acceptable? Also, I'm wondering how to get 1 p-value per protein out of these.

Thanks a lot!

modified 2.2 years ago by Petr Ponomarenko2.6k • written 2.2 years ago by mimA30
2.2 years ago by
shunyip180
shunyip180 wrote:

First, let me assume that you have protein expression data where you have n proteins and 3 samples for each treatment.

The suggestion to use FC could be carried out as follows:

1, for each treatment, calculate the mean expression of each protein across the 3 samples.

2, divide each protein's mean expression by the control's expression to obtain fold changes.

3, take the logarithm (usually base 2) to obtain log fold changes.

4, now you have a population of log2 fold changes. Calculate the mean and standard deviation of these log2 fold changes.

5, using that mean and standard deviation, calculate a z-score for each protein and then its p-value.
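The five steps above can be sketched in Python with only the standard library. The protein names and quan values are made up for illustration, and the two-sided p-value from a standard normal distribution is an assumption, since the answer does not say how the p-value should be derived. The sketch covers one treatment; run on the second treatment as well, it gives one z-score per protein per treatment.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical quan values: protein -> three replicates of ONE treatment.
treatment = {
    "P1": [1000.0, 1100.0, 900.0],
    "P2": [2200.0, 2000.0, 2400.0],
    "P3": [1300.0, 1250.0, 1350.0],
    "P4": [4000.0, 4200.0, 3800.0],
    "P5": [500.0, 450.0, 550.0],
}
# Hypothetical single control measurement per protein.
control = {"P1": 1000.0, "P2": 1100.0, "P3": 1300.0, "P4": 1000.0, "P5": 1000.0}


def log2_fold_changes(treatment, control):
    """Steps 1-3: mean of the replicates, divide by the control, take log2."""
    return {p: math.log2(mean(v) / control[p]) for p, v in treatment.items()}


def z_scores_and_p_values(log_fc):
    """Steps 4-5: z-score against all log2 FCs, then a two-sided normal p-value."""
    values = list(log_fc.values())
    mu, sd = mean(values), stdev(values)
    std_normal = NormalDist()  # assumption: z-scores are compared to N(0, 1)
    out = {}
    for protein, x in log_fc.items():
        z = (x - mu) / sd
        out[protein] = (z, 2 * (1 - std_normal.cdf(abs(z))))
    return out


log_fc = log2_fold_changes(treatment, control)
results = z_scores_and_p_values(log_fc)
for protein, (z, p) in results.items():
    print(f"{protein}: log2FC={log_fc[protein]:+.2f} z={z:+.2f} p={p:.3f}")
```

Note that a z-score computed this way measures how unusual a protein's fold change is compared with the other proteins in the same treatment. It is not by itself a test of treatment 1 against treatment 2.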

I hope this helps,

modified 2.2 years ago • written 2.2 years ago by shunyip180

Thanks a lot for your reply, Shunyip. But I'm just wondering: a z-score is calculated as (x - mean)/sd. Here x is the log2FC value, so I end up with 2 z-scores for each protein again (one per treatment).

written 2.2 years ago by mimA30
2.2 years ago by
United States / Los Angeles / ALAPY.com
Petr Ponomarenko2.6k wrote:

Hi mimA,

It looks like you are trying to test whether there is a difference between two treatments, with one control and 3 samples for each treatment. Your null hypothesis H0 here should be that there is no difference between the treatments. You can then compute a p-value: the probability of observing data at least as extreme as yours if that null hypothesis were true. Similar questions have already been asked on other forums, e.g. http://stats.stackexchange.com/questions/62558/test-difference-between-samples-with-very-small-sample-size and http://stats.stackexchange.com/questions/37993/is-there-a-minimum-sample-size-required-for-the-t-test-to-be-valid

In short, everything depends on what you assume about the distribution of the values in each treatment, and on whether you can assume the variances to be equal between the two treatments.

If very little is known about the distributions and the sample sizes are small, then a rank test such as the Mann–Whitney–Wilcoxon test is a safer approach. https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test
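As an illustration of the rank-test option, here is a minimal sketch of an exact two-sided Mann-Whitney U test for one protein, written with the standard library only. The function name and the ratio values are hypothetical, and the sketch assumes there are no tied values. It enumerates every way of splitting the pooled values into two groups, which is cheap for 3 vs 3 samples (20 splits).

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test by enumerating all group splits.

    Assumes no tied values. Returns (U statistic for x, two-sided p-value).
    """
    pooled = sorted(x + y)
    n_x, n_y = len(x), len(y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # ranks 1..N, no ties

    def u_stat(group):
        # U = rank sum of the group minus its smallest possible rank sum
        return sum(rank[v] for v in group) - n_x * (n_x + 1) / 2

    u_obs = u_stat(x)
    centre = n_x * n_y / 2  # expected U under the null hypothesis
    splits = list(combinations(pooled, n_x))
    extreme = sum(
        1 for g in splits if abs(u_stat(g) - centre) >= abs(u_obs - centre)
    )
    return u_obs, extreme / len(splits)


# Hypothetical ratios for one protein; every treatment 1 value is larger.
treatment1 = [1.8, 2.1, 2.4]
treatment2 = [0.9, 1.1, 1.3]
u, p = mann_whitney_exact(treatment1, treatment2)
print(u, p)
```

With 3 samples per group there are only 20 possible splits, so even complete separation of the two groups gives a two-sided exact p-value of 2/20 = 0.1. A rank test on 3 vs 3 samples therefore cannot fall below the usual 0.05 threshold, which is worth keeping in mind when choosing it.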

modified 2.2 years ago • written 2.2 years ago by Petr Ponomarenko2.6k

Thanks for your answer, Petr. The only thing I'm concerned about is whether it's acceptable to use ratios, such as the ones I calculated by dividing by the control, and to use those numbers directly to calculate p-values with something like limma. This way the protein levels in each treatment become relative to the control. Do you have any thoughts on that?

written 2.2 years ago by mimA30

How you normalize the data first depends on your experiment and on your assumptions about the distributions of the observed values. In some situations, your approach can work. Could you please describe the experiments in more detail?

written 2.2 years ago by Petr Ponomarenko2.6k

I have mass spec data in which proteins are represented as quan values. These values are quite large (in the thousands, e.g. 1000, 1300, 2200 and so on). Our proteomics facility told me that no further normalisation is required for this data. However, since we wanted expression levels relative to our control, we divided the treatment quan values by the control quan values, resulting in ratios for each replicate of treatment 1 and of treatment 2. To determine differences between the 2 treatments, could I now calculate the average of these ratios for treatment 1 and the average for treatment 2, calculate a fold change between them, and calculate p-values using, let's say, a t-test?

written 2.2 years ago by mimA30

Yes, mimA, you can. That is the right approach. You can use a t-test if you have good reason to assume that your samples are normally distributed. Averaging over 3 samples per treatment is OK for a t-test.
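A minimal sketch of the per-protein t-test discussed above, using only the Python standard library. Several choices here are assumptions rather than anything stated in the thread: the quan values are made up, the ratios are log2-transformed before testing, Welch's variant is used (it does not assume equal variances), and the p-value is obtained by numerically integrating the t density instead of calling a statistics package.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t via Simpson integration of its density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # approximates the integral of the density over [0, |t|]
    return max(0.0, 1 - 2 * area)


def welch_t_test(a, b):
    """Welch's two-sample t-test; returns (t, degrees of freedom, two-sided p)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, t_two_sided_p(t, df)


# Hypothetical quan values for ONE protein.
control = 1000.0
treatment1 = [2000.0, 2400.0, 2200.0]
treatment2 = [1300.0, 1100.0, 1200.0]

# log2 ratios relative to the single control value
ratios1 = [math.log2(v / control) for v in treatment1]
ratios2 = [math.log2(v / control) for v in treatment2]
t_ratio, df_ratio, p_ratio = welch_t_test(ratios1, ratios2)

# The same test on log2 quan values without dividing by the control
t_raw, df_raw, p_raw = welch_t_test(
    [math.log2(v) for v in treatment1], [math.log2(v) for v in treatment2]
)
print(t_ratio, df_ratio, p_ratio)
print(t_raw, df_raw, p_raw)
```

Because every replicate of a protein is divided by the same single control value, the division only shifts all log2 values by a constant, so the two calls give the same t statistic and p-value up to rounding. The ratios make the numbers easier to read as "relative to control", but for a direct treatment 1 vs treatment 2 comparison of one protein they carry the same information as the undivided values.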

written 2.2 years ago by Petr Ponomarenko2.6k

ok will do that, thank you!

written 2.2 years ago by mimA30