How do they get such low adjusted P values in differential analysis of proteomics data?
2.1 years ago
Xiaokang ZH ▴ 60

I'm new to proteomics data, and my experience with it is quite different from transcriptomics data (RNA-Seq). I have protein abundance data for samples from a control group and an exposure group (an in vivo experiment exposing fish to a toxicant, 10 samples in each group). The purpose is to find the differentially expressed proteins. I used the package DEP for the preprocessing and statistical analysis. In the end, the adjusted P values (p-adj) are very high (either equal to 1 or close to 1) and the fold changes are also almost 1. So if I use p-adj, no protein is differentially expressed. I read some papers about proteomics data (the ones citing that package) and they report very low p-adj values (0.05 is used as the threshold). I suspect that I didn't do the analysis correctly, but I couldn't find the problem...

The steps I've applied to the raw abundance data are: remove the proteins that have more than half of their values missing in any group, transform the raw abundance with arcsin, impute the missing values using maximum likelihood estimation, and use the function test_diff from DEP to do the differential analysis.
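To make the first step concrete, here is a minimal sketch of the missing-value filter in plain Python. It is not DEP's implementation: the function names, the nested-dict layout, and the toy values are all hypothetical, and "more than half" is read as a strict inequality (exactly half missing is kept).

```python
from typing import Dict, List, Optional

# None marks a missing abundance value (hypothetical layout, not DEP's).
GroupValues = Dict[str, List[Optional[float]]]


def keep_protein(values_by_group: GroupValues,
                 max_missing_fraction: float = 0.5) -> bool:
    """True if no group has MORE than max_missing_fraction missing values."""
    for values in values_by_group.values():
        if not values:
            continue  # an empty group carries no information either way
        missing = sum(1 for v in values if v is None)
        if missing / len(values) > max_missing_fraction:
            return False
    return True


def filter_proteins(table: Dict[str, GroupValues],
                    max_missing_fraction: float = 0.5) -> Dict[str, GroupValues]:
    """Keep only the proteins that pass the per-group missing-value rule."""
    return {protein: groups for protein, groups in table.items()
            if keep_protein(groups, max_missing_fraction)}


# Toy data with 4 samples per group (made up for illustration).
toy = {
    "protA": {"control": [1.0, 1.2, None, 0.9], "exposure": [1.1, 1.0, 1.3, 1.2]},
    "protB": {"control": [None, None, None, 0.8], "exposure": [1.0, 1.1, 0.9, 1.0]},
    "protC": {"control": [2.0, None, None, 2.1], "exposure": [2.2, 2.0, 2.1, 1.9]},
}
kept = filter_proteins(toy)
print(sorted(kept))
```

In this toy table protB is dropped because 3 of its 4 control values are missing, while protC stays because exactly half missing does not exceed the limit.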

proteomics differential analysis DEP P value FDR • 1.2k views

Dear @Xiaokang ZH

As you said you are new to proteomics data analysis, I would like to suggest some free software that makes it easier to start working with proteomics data.

The first is MaxQuant and the other is Perseus. MaxQuant is for analyzing raw data; you then import its output into Perseus, where you can do a lot of analysis, like LFQ.

A very interesting point is that the Max Planck Institute offers a series of summer school tutorial videos:

MaxQuant Summer School 2019 Madison

If you already know this software, sorry for my answer.

Best Regards,

Leite


Thank you Leite. The data I have is already processed abundance. I'll check out Perseus.


I am guessing your groups have a large variation. You should try to find outliers with a PCA/MDS.
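As a rough illustration of that kind of outlier check, the sketch below computes the first principal component of a samples-by-proteins matrix with power iteration and flags samples whose PC1 score sits far from the rest. Everything here is an assumption for illustration: the function names, the toy values, and the robust z-score cutoff of 3.5 are not from DEP or from any specific paper, and in practice one would plot the first two or three components and inspect them by eye.

```python
import math
from statistics import median


def pc1_scores(samples, iterations=200):
    """PC1 scores for a samples x proteins matrix (list of equal-length rows)."""
    n, p = len(samples), len(samples[0])
    # Center every protein (column) on its mean across samples.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    x = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Gram matrix X X^T; its top eigenvector gives the PC1 scores up to scale.
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    # Deterministic, non-constant start vector (a constant vector is in the
    # null space of the Gram matrix after centering).
    u = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        v = [sum(gram[i][k] * u[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0.0:
            return [0.0] * n  # no variance at all
        u = [c / norm for c in v]
    eigenvalue = sum(u[i] * sum(gram[i][k] * u[k] for k in range(n))
                     for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * c for c in u]


def flag_outliers(scores, cutoff=3.5):
    """Indices whose robust z-score (median/MAD based) exceeds the cutoff."""
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0.0:
        return []
    return [i for i, s in enumerate(scores)
            if abs(s - med) / (1.4826 * mad) > cutoff]


# Toy matrix: 6 samples x 4 proteins, last sample deliberately shifted.
toy_samples = [
    [10.0, 5.0, 3.0, 8.0],
    [10.2, 5.1, 2.9, 8.1],
    [9.9, 4.9, 3.1, 7.9],
    [10.1, 5.0, 3.0, 8.0],
    [9.8, 5.2, 3.0, 8.2],
    [14.0, 9.0, 7.0, 12.0],
]
scores = pc1_scores(toy_samples)
outliers = flag_outliers(scores)
print(outliers)
```

The sign of the PC1 scores is arbitrary, which is why the flagging step only uses absolute deviations from the median.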


Thank you! That's a good point. I did find 3 outliers with PCA, and after removing them the p-adj values became more normal (I used to have one comparison of two groups with all p-adj = 1, and now they look normal). But they are still very high. A quick glance at the lowest ones: 0.00827, 0.0611, 0.219, 0.265, and their corresponding fold changes are: 1.0514644, 0.9507250, 0.9721831, 1.0139595.
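For a quick read of those numbers, the sketch below checks each p-adj against the 0.05 threshold mentioned in the question and converts the fold changes to log2 scale. It assumes the reported fold changes are plain ratios between the two groups, which the thread does not state; the variable names are hypothetical.

```python
import math

# Values quoted in the post above.
padj = [0.00827, 0.0611, 0.219, 0.265]
fold_change = [1.0514644, 0.9507250, 0.9721831, 1.0139595]
alpha = 0.05  # threshold mentioned in the question

rows = []
for p, fc in zip(padj, fold_change):
    # Assumption: fc is a plain ratio, so log2(fc) is the log2 fold change.
    rows.append((p, fc, math.log2(fc), p < alpha))

for p, fc, log2_fc, significant in rows:
    print(f"p-adj={p:<8} FC={fc:.4f} log2FC={log2_fc:+.4f} significant={significant}")

n_significant = sum(1 for row in rows if row[3])
print("proteins passing the threshold:", n_significant)
```

With these four values only the first protein passes the 0.05 threshold, and every fold change is within about 5% of 1, so the effect sizes are small even where the test is significant.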