What is the minimal recommended cutoff for fold change for such variable samples?
4.1 years ago
harelarik ▴ 90


We have proteomics results from ~35 mammals. The samples were taken by surgery of the same organ in all individuals.

Since we are working with individual mammals (as opposed to genetically identical tissue cultures or plants), there is high variability between samples. Therefore we cannot use, for example, FDR to filter t-test results that identify differentially expressed (DE) proteins between treated and non-treated individuals. If we do, there are hardly any significant DE proteins. As a result, we use only p<0.05 from the t-test to identify DE proteins (no FDR), and the number of DE proteins is affected mainly by the fold change cutoff.

My questions are: 1. What is the minimal recommended cutoff for fold change for such variable samples? 2. How do we choose a lower cutoff that still makes sense and is still acceptable for publication in a good journal?

I have seen, for example, that Yuan et al. (2016, in Journal of Proteomics, https://doi.org/10.1016/j.jprot.2020.103683) used a cutoff of 1.3-fold change (and p-value < 0.05) for human samples (i.e., they worked with mammalian samples, like us).

3. Does anyone know of other works with such a fold change cutoff, or lower, that were published in reasonable journals?

Thank you, Arik

4.1 years ago

A few things first:

If you use a p-value cutoff on t-test results across the whole proteome, most of your results will be false positives. For example, if you detected 8,000 proteins and therefore did 8,000 tests, you would expect around 400 false positives (8000*0.05). If you find 500 proteins with p < 0.05, then about 80% of them will be false positives. I suggest using FDR with a higher threshold than 0.05.
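The arithmetic above can be written out as a small sketch. The function names are made up for illustration, and the calculation assumes the worst case where every null hypothesis is true, so it is a rough upper bound rather than a proper FDR estimate.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis were true."""
    return n_tests * alpha


def rough_fdr(n_tests, alpha, n_hits):
    """Rough FDR: expected false positives divided by observed hits, capped at 1."""
    if n_hits == 0:
        return 0.0
    return min(1.0, expected_false_positives(n_tests, alpha) / n_hits)


# Numbers from the example in the text: 8,000 tests, p < 0.05, 500 hits.
fp = expected_false_positives(8000, 0.05)
fdr = rough_fdr(8000, 0.05, 500)
print(f"expected false positives: {fp:.0f}, rough FDR: {fdr:.0%}")
```

With fewer observed hits the same calculation gets worse, which is why an uncorrected p-value cutoff on a noisy dataset is so risky.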

Instead of using a p-value < 0.05 from a t-test, there are several other things you could try:

  • The t-test is not generally suitable for proteomics data, which is count based (if it's MS-MS anyway). You want to try one of the count-based packages that are usually used for RNAseq, like edgeR or DESeq (negative binomial based) or limma-voom. This will probably give you better FDRs.
  • Failing that, you could try an FDR threshold of 0.2 or 0.3. Yes, 20%-30% of your hits will be false positives, but that is less than would be the case using a p-value.
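To make the FDR threshold concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment, which is the procedure most tools use when they report an FDR or "adjusted p-value". The thread does not say which FDR method was used, so treat that as an assumption; the p-values below are invented toy numbers, not real data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for ten proteins.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
qvals = benjamini_hochberg(pvals)

# Proteins kept at a relaxed FDR threshold of 0.2.
hits = [i for i, q in enumerate(qvals) if q <= 0.2]
print(len(hits), "proteins pass FDR <= 0.2")
```

In practice you would take the adjusted values straight from whichever package you use for the test, rather than rolling your own.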

Bearing that in mind, there is no "correct" way to choose a log Fold Change threshold. Log Fold Change thresholds are used to find genes/proteins where the change is big enough to be biologically interesting - they are determining the biological merit, not statistical merit and are thus subjective.

However, it is true that larger log fold changes, particularly at higher expression levels, are more likely to be real than smaller ones, but you can't put a threshold and say "above this threshold is good, below this threshold is bad".

Proper use of the fold change threshold: The correct way to use a fold change threshold, in the absence of meaningful FDR thresholds, would be to not rely on hard thresholds in your downstream analyses. There are many biologically interesting questions you can ask using analyses that rely on ranking proteins, rather than dividing them into two categories (different and not different).
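As one illustration of a ranking-based analysis, the sketch below asks whether a set of proteins (say, a pathway) sits nearer the top of the ranked list than random sets of the same size, using a simple permutation test. This is only one of many possible rank-based approaches, and the function name, protein names, scores and pathway membership are all hypothetical toy values.

```python
import random


def rank_set_test(scores, members, n_perm=2000, seed=0):
    """Permutation test on mean rank: are `members` ranked nearer the top
    than random sets of the same size? Rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    rank = {name: r for r, name in enumerate(ordered, start=1)}
    k = len(members)
    observed = sum(rank[m] for m in members) / k
    rng = random.Random(seed)
    names = list(scores)
    as_extreme = 0
    for _ in range(n_perm):
        sample = rng.sample(names, k)
        if sum(rank[s] for s in sample) / k <= observed:
            as_extreme += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy data: 200 proteins scored by some statistic (e.g. log fold change).
toy_rng = random.Random(1)
scores = {f"P{i}": toy_rng.gauss(0, 1) for i in range(200)}
pathway = [f"P{i}" for i in range(10)]
for name in pathway:
    scores[name] += 1.5  # shift the hypothetical pathway upwards

mean_rank, pval = rank_set_test(scores, pathway)
print(f"mean rank {mean_rank:.1f}, permutation p = {pval:.4f}")
```

No protein is ever labelled "DE" or "not DE" here; the whole ranked list is used, which is the point of avoiding a hard threshold.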

Finally, to put it bluntly, any reviewer that will accept a log fold change threshold over an FDR one won't have anything useful to say about the position of that threshold anyway.


Thank you very much,

However, we should consider that we are using samples which are genetically different, and therefore we get high variability and it is hard to get a good FDR:

  • If we select FDR=25% there are only 2 significantly DE proteins.
  • If we select FDR=40% there are only 50 significantly DE proteins.
  • Most of the proteins are distributed around FDR=80%.
  • The top 3% (with best FDR) of the proteins have FDR<46%.

Considering that our list of DE genes seems to make biological sense, it would be a shame to discard it because there is no good FDR. To the best of my understanding, the work I cited above also faces the same issue (using samples from different human individuals) and did not use FDR (published in Journal of Proteomics).


Well, I'd start from the position that the t-test is the wrong statistical test for proteomics data.

You can have a good enrichment for biologically relevant genes and still have most of the genes be wrong. Take the example above. If you have 500 DE genes out of 8,000, you will have an FDR of 80%. That means 80% of your hits will be false positives. But if 20% are true, that's more than enough to show an enrichment if those 20% are relevant.
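A quick sketch of why that happens, using a one-sided hypergeometric test. The 8,000 proteins, 500 hits and 80% false positives come from the example above; the pathway size (100) and the number of true hits that fall in the pathway (30) are invented for illustration.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when drawing n items without replacement from N items,
    of which K belong to the category of interest."""
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


N = 8000            # proteins tested (from the example)
n = 500             # proteins called DE (from the example)
K = 100             # hypothetical pathway size
false_hits = 400    # 80% of the 500 hits
pathway_true = 30   # hypothetical: true hits that belong to the pathway

# False hits land in the pathway only at the background rate.
pathway_false = round(false_hits * K / N)
observed_overlap = pathway_true + pathway_false
expected_overlap = n * K / N

p_enrich = hypergeom_tail(N, K, n, observed_overlap)
print(f"overlap {observed_overlap} vs {expected_overlap} expected, p = {p_enrich:.2e}")
```

So a strongly enriched pathway tells you the list contains signal; it does not tell you that any individual protein on the list is a true positive.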

If the top 3% of proteins have an FDR of 46%, then there are two possible explanations: either 46% of them are false positives, or the FDR is incorrectly calculated, because that is what FDR is: the fraction of your results that are expected to be wrong.

This problem is faced all the time in high-throughput biology. Take GWAS: we can show that with a threshold of 10^-8 we find only a tiny fraction of the relevant genes, and that at 0.05 we find most or all of the relevant genes, but we also find thousands of genes that are wrong.

I wouldn't put too much store by what is published. Just because something is published, doesn't mean it is right.

So if you can't find a better way to calculate FDR, I'd argue that it is better to analyse your results through a framework other than DE, rather than do DE badly.

