Question: How To Calculate Fold Change And Its Significance In Quantitative Proteomics
9.8 years ago by
United States
Woa2.8k wrote:

I would like to know which statistical methods are commonly used for measuring differential protein expression in a quantitative proteomics study, particularly when the 'treatment' and 'control' groups each have replicates. In the diagram, the numerical values indicate protein expression levels in some arbitrary unit.

Link to Diagram

proteomics statistics • 10k views
modified 9.6 years ago by Julian200 • written 9.8 years ago by Woa2.8k
9.8 years ago by
Manchester, UK
Julian200 wrote:

In addition to what Laurent has answered, I would say that "common" statistical methods are only just beginning to be applied in proteomics. One of the key areas I look to for statistical methodology is the microarray field, although you need to take some care with the assumptions when carrying those methods over to quantitative proteomics data.

One of your major choices will be how to extract the data into suitable figures; using one of the standard formats (e.g., mzXML, mzML, mzIdentML) may be a good way to go. However, be aware that a lot of mass spectrometers automatically process the data before outputting anything. As a result, you have to ask what counts as raw data, and what the values you get out actually mean. The only hope is that all proteins in your samples will have been treated in the same manner (provided you used the same mass spectrometer to gather the data from all your samples).

Then, as Laurent says, you'll need to consider normalising across all your data sets to account for differences in the amount of protein loaded on a gel, or on an LC, or whatever you have done with your sample. You might have allowed for this in your experimental design by including standards, etc.
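As a rough illustration of the normalisation step, here is a minimal Python sketch of per-sample median centring on the log2 scale. It assumes that most proteins do not change between samples, which is an assumption you would need to check for your own experiment. The function name `median_normalise` and the example intensities are made up for illustration.

```python
import math
from statistics import median

def median_normalise(intensities):
    """Log2-transform each sample and subtract that sample's median.

    `intensities` maps a sample name to a list of raw protein intensities
    (same protein order in every sample; all values must be > 0).
    """
    normalised = {}
    for sample, values in intensities.items():
        logged = [math.log2(v) for v in values]
        centre = median(logged)
        normalised[sample] = [x - centre for x in logged]
    return normalised

# Hypothetical example: sample "B" was loaded with twice as much protein as "A".
raw = {"A": [100.0, 200.0, 400.0], "B": [200.0, 400.0, 800.0]}
norm = median_normalise(raw)
```

After centring, the two samples in this toy example have the same profile, because the only difference between them was the loading amount.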

After all that, you might also want to do some principal component analysis, to check that the differences between your samples are mainly down to the conditions you are studying and not to other factors, such as when the samples were processed.
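To make the PCA check concrete, here is a small sketch that computes each sample's score on the first principal component using plain power iteration. This is only meant to show the idea; in practice you would use an established PCA routine from R or a numerical library. The function name `first_pc_scores` and the data (two samples processed on one day, two on another) are invented for illustration.

```python
import math

def first_pc_scores(rows, iterations=200):
    """Scores of each sample on the first principal component.

    `rows` is a list of samples, each a list of log-scale protein values.
    """
    n, p = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - col_means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between proteins (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication by `cov` converges to its
    # dominant eigenvector, provided the start vector is not orthogonal to it.
    vec = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        length = math.sqrt(sum(x * x for x in new))
        vec = [x / length for x in new]
    # Project every centred sample onto that eigenvector.
    return [sum(c[j] * vec[j] for j in range(p)) for c in centred]

# Hypothetical log2 intensities for three proteins in four samples.
rows = [
    [10.0, 12.0, 11.0],  # control, processed on day 1
    [10.2, 12.1, 11.1],  # treatment, processed on day 1
    [11.0, 13.1, 12.0],  # control, processed on day 2
    [11.1, 13.0, 12.2],  # treatment, processed on day 2
]
scores = first_pc_scores(rows)
```

In this made-up data set the first component splits the samples by processing day rather than by treatment, which is exactly the kind of unwanted effect the check is meant to reveal.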

From that, you might want to set up some ANOVAs (or, if appropriate, just a simple t-test), but make sure you are making the right assumptions about your data and experiments.
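For a single protein with replicates in both groups, the fold change and a simple t-test could be sketched as follows. This uses Welch's unequal-variance form of the t-test on log2 intensities, which is one reasonable choice rather than the only one. The function names and the replicate values are hypothetical.

```python
import math
from statistics import mean, variance

def log2_fold_change(treatment, control):
    """Difference of mean log2 intensities (treatment minus control)."""
    a = [math.log2(x) for x in treatment]
    b = [math.log2(x) for x in control]
    return mean(a) - mean(b)

def welch_t(treatment, control):
    """Welch's t statistic and approximate degrees of freedom on log2 values."""
    a = [math.log2(x) for x in treatment]
    b = [math.log2(x) for x in control]
    # Squared standard error of each group mean.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical intensities for one protein, three replicates per group.
treatment = [220.0, 250.0, 240.0]
control = [100.0, 120.0, 110.0]
lfc = log2_fold_change(treatment, control)
t_stat, dof = welch_t(treatment, control)
```

A log2 fold change of 1 corresponds to a doubling. To turn the t statistic into a p-value you need the t distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` on the log2 values gives the statistic and p-value directly. When you test many proteins at once, the p-values also need a multiple-testing correction.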

In terms of software, here is what I use. If you have money, you might want to consider Nonlinear Dynamics' Progenesis. It's not perfect, but it allows you to look at the data with some statistical rigor. It is quite restrictive in the workflow it offers for looking at data, but it seems to work well, especially when working from the raw data. If you don't have money, look at some of the R tools (particularly the packages in the Bioconductor project). They are always under development, and if you can get the data into the right format they work well, although it can be a handful if you aren't used to command-line processing. Also worth a look is software like MaxQuant from the Mann lab, but getting data into it can be a trial. When I last played with it, it didn't easily support Mascot identifications. If you are doing things like SILAC, then there is also software like SILACAnalyzer, which is part of the OpenMS suite of tools for proteomics data analysis.

written 9.8 years ago by Julian200

Even if you have money, take the time to look at some open-source alternatives. Many of these come with a great deal of voluntary support that commercial companies can hardly compete with. Of course, it takes a bit of effort and learning, but it will pay off.

written 9.8 years ago by Laurent1.7k

Thanks Laurent and Julian.

written 9.8 years ago by Woa2.8k
9.8 years ago by
Cambridge, UK
Laurent1.7k wrote:

This is a bit of an open question. You don't need a method specifically tailored to quantitative proteomics data. Any appropriate statistical method should perform well as long as its basic requirements are met. You might also think about data normalisation and possibly quality control.

You might get more precise answers from more people if you provide more details about your data, how it was generated and how you performed the quantitation.

Hope this helps.

written 9.8 years ago by Laurent1.7k