Question

Filtering proteomics matrix before diff. exp.

4.9 years ago

rpraveenkumarbio ▴ 10

Hi, I would like expert feedback on a question. Can we filter out all the non-differentially expressed proteins before performing differential expression analysis on a proteomics dataset? Let me describe in detail what I am trying to achieve.

We have mass spec global proteomics data of ~5000 proteins for 15 cases: 9 in cond-A and 6 in cond-B. Using the log2-normalized data, I performed differential expression with limma lmFit. About 250 proteins had a p-value < 0.05, but the FDR is high because of the p-value distribution across most of the proteins (tests). Hence I decided to filter the input matrix to keep only those 250 proteins (p-value < 0.05), ran the differential expression test again, and ended up with ~150 proteins with adjusted p-value < 0.05.

So my simple question is: is it technically correct to filter out all the non-differentially expressed proteins before performing differential expression analysis on a proteomics dataset?

limma edgeR DESeq2 • 902 views

updated 4.9 years ago by JC 13k • written 4.9 years ago by rpraveenkumarbio ▴ 10

score 1 · Answer 1 · 2019-06-21

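Strictly speaking, you are "adjusting" your inputs to get a better selection, and the result no longer represents what is happening in your samples. The FDR correction assumes it sees every test you ran; if you keep only the proteins that already passed p < 0.05 and adjust again, the adjusted p-values are no longer valid. It would be better to check whether some samples are too different from the rest, or whether some proteins have almost-zero values, and remove those to reduce variability between samples.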
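
To make the circularity concrete, here is a minimal sketch in plain Python (not limma). It assumes a worst case where none of the 5000 proteins truly differ, so the raw p-values are simply uniform random numbers, and it re-adjusts the same p-values after filtering instead of refitting a model. The `bh_adjust` helper is a hand-written Benjamini-Hochberg adjustment for illustration, not a library function.

```python
import random

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (illustrative helper)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

random.seed(0)

# Assumption: every protein is null, so p-values are uniform on (0, 1).
pvals = [random.random() for _ in range(5000)]

# Correct: adjust over all tests that were run.
hits_full = sum(q < 0.05 for q in bh_adjust(pvals))

# Circular: keep only p < 0.05, then adjust again on that subset.
kept = [p for p in pvals if p < 0.05]
hits_filtered = sum(q < 0.05 for q in bh_adjust(kept))

print("proteins kept by the p < 0.05 filter:", len(kept))
print("significant after adjusting all tests:", hits_full)
print("significant after filtering, then adjusting:", hits_filtered)
```

In this setup every protein that survives the p-value filter also passes the second adjustment, even though nothing is truly different by construction: the largest surviving p-value is already below 0.05, and BH never pushes the others above it. A filter that does not look at the group comparison (for example, dropping proteins with near-zero intensity in all samples) does not have this problem.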