Question: COMPLEAT analysis - Score calculation
Assa Yeroslaviz wrote:

Hi all,

I am not sure whether anyone here has worked with the COMPLEAT web tool. It is a tool for protein complex enrichment analysis.

I am trying to understand how the score for the protein complexes in this tool is calculated.
For that I have taken one example file `HumanRNAiDNARepairScreen.txt` and uploaded it to COMPLEAT without changing the parameters.

I then took a closer look at the SLIK (SAGA-like) complex.
According to the tool, it has a score of `0.02553 (=2.553e-02)`.

I have tried to reproduce this score from the scores of the individual complex members, which I took from the tool's output for each protein (see the table below).

The formula for calculating the complex score is given in your paper as

`CIQM = (1 / ((Q3 - Q1) + 1)) * SUM(Xi from i=Q1 to i=Q3)`

This is the vector I am using for the calculations, already sorted in decreasing order.

`SLIK <- c(1.68, 1.6, 1.56, 1.47, 1.13, 1.12, 1.03, 0.94, 0.91, 0.59, 0.51, 0.5, 0.49, 0.46, 0.42, 0.31, 0.001, 0, 0, -0.005, -0.1, -0.2, -0.27, -0.45, -0.8, -0.88, -1.09)`

As you can see, there are 27 members (n = 27).
If I calculate Q1 and Q3 according to your paper, I get 7 and 20 for Q1 and Q3, respectively.

```
Q1 = (27/4) + 1 = 7.75 -> integer is 7
Q3 = (3*27)/4 = 20.25 -> integer is 20
```

When I then calculate the IQM with the above formula, I get a different value from the one shown on the screen.
The sum of the vector values between position 7 and position 20, counted in increasing order, is:

`sum(sort(SLIK)[7:20]) = sum(0.94, 0.91, 0.59, 0.51, 0.5, 0.49, 0.46, 0.42, 0.31, 0.001, 0, 0, -0.005, -0.1) = 5.026`

So the IQM can be calculated like this:

`IQM = (1 / ((20 - 7) + 1)) * 5.026 = 0.359`
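
For reference, the calculation above can be reproduced with a few lines of R. This is only a sketch of my reading of the formula: it assumes that the fractional positions are truncated with `floor()` and that the indices refer to the scores sorted in increasing order. The variable names (`x`, `q1`, `q3`, `iqm`) are illustrative and do not come from COMPLEAT.

```r
# Member scores of the SLIK complex, as listed in this post
SLIK <- c(1.68, 1.6, 1.56, 1.47, 1.13, 1.12, 1.03, 0.94, 0.91, 0.59,
          0.51, 0.5, 0.49, 0.46, 0.42, 0.31, 0.001, 0, 0, -0.005,
          -0.1, -0.2, -0.27, -0.45, -0.8, -0.88, -1.09)

n  <- length(SLIK)        # 27 members
q1 <- floor(n / 4 + 1)    # 7.75  -> 7  (assumption: truncate to integer)
q3 <- floor(3 * n / 4)    # 20.25 -> 20 (assumption: truncate to integer)

x <- sort(SLIK)           # ascending order (assumption)

# Mean of the values at positions q1..q3
iqm <- sum(x[q1:q3]) / ((q3 - q1) + 1)
round(iqm, 3)             # 0.359
```

Note that the same indices applied to the vector in decreasing order select a different slice, so the sort direction matters for this calculation.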

So I have a discrepancy between my calculated IQM and yours.

I was wondering what I am doing wrong in this calculation. Am I taking the wrong quartiles?

Do I need to take the actual quartile values instead?
The quartiles of SLIK can be calculated like this:

```
quantile(SLIK)
     0%     25%     50%     75%    100%
-1.0900 -0.0525  0.4600  0.9850  1.6800
```

Should `Q1` and `Q3` be `-0.0525` and `0.9850`?
But how would I use them in the sum of the formula?
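
One way I could imagine using the quartile values is as cut-offs on the scores rather than as positions, i.e. averaging all scores that lie between the 25% and 75% quantiles. The sketch below shows that variant. It is purely my guess and not something taken from the paper; it relies on the default `quantile()` algorithm in R (type 7), and the names `q` and `inner` are made up for illustration.

```r
# Member scores of the SLIK complex, as listed in this post
SLIK <- c(1.68, 1.6, 1.56, 1.47, 1.13, 1.12, 1.03, 0.94, 0.91, 0.59,
          0.51, 0.5, 0.49, 0.46, 0.42, 0.31, 0.001, 0, 0, -0.005,
          -0.1, -0.2, -0.27, -0.45, -0.8, -0.88, -1.09)

# 25% and 75% quantiles (default type 7): -0.0525 and 0.985
q <- quantile(SLIK, probs = c(0.25, 0.75))

# Keep only the scores that fall inside [Q1, Q3] (hypothetical interpretation)
inner <- SLIK[SLIK >= q[1] & SLIK <= q[2]]

length(inner)             # 13 scores remain
round(mean(inner), 3)     # 0.394
```

This value-based variant does not reproduce the score reported by the tool either.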

Do I need to remove all proteins with a score of 0 from the vector?
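
To check this idea, the position-based calculation can be repeated after dropping the zero scores. Again, this is only a sketch under my own assumptions (truncation with `floor()`, ascending sort); the names `nz` and `iqm_nz` are illustrative.

```r
# Member scores of the SLIK complex, as listed in this post
SLIK <- c(1.68, 1.6, 1.56, 1.47, 1.13, 1.12, 1.03, 0.94, 0.91, 0.59,
          0.51, 0.5, 0.49, 0.46, 0.42, 0.31, 0.001, 0, 0, -0.005,
          -0.1, -0.2, -0.27, -0.45, -0.8, -0.88, -1.09)

# Drop members with a score of exactly 0, then sort in increasing order
nz <- sort(SLIK[SLIK != 0])

n  <- length(nz)          # 25 members remain
q1 <- floor(n / 4 + 1)    # 7.25  -> 7
q3 <- floor(3 * n / 4)    # 18.75 -> 18

# Mean of the values at positions q1..q3 of the zero-free vector
iqm_nz <- sum(nz[q1:q3]) / ((q3 - q1) + 1)
round(iqm_nz, 3)          # 0.419
```

Removing the zeros therefore does not bring the result closer to the reported `0.02553`.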

I appreciate all the help in advance and hope you can help me solve this problem.

cu,
Assa

SLIK complex members:

```Symbol    ID    Name     Score
TAF10    6881     TAF10    -0.8
TAF5    6877     TAF5    -0.005
KAT2A    2648     KAT2A    0.46
KAT2B    8850     KAT2B    -1.09
BPTF    2186     BPTF    0.49
USP51    158880     USP51    0
USP22    23326     USP22    1.12
TAF6L    10629     TAF6L    -0.88
CHD1    1105     CHD1    1.68
TAF9B    51616     TAF9B    -0.2
SUPT3H    8464     SUPT3H    1.03
ATXN7    6314     ATXN7    0.42
TRRAP    8295     TRRAP    0.94
USP3    9960     USP3    0.31
CHD3    1107     CHD3    0.91
CECR2    27443     CECR2    1.56
LAPTM5    7805     LAPTM5    0.001
CHD2    1106     CHD2    1.13
CHD4    1108     CHD4    0
TAF5L    27097     TAF5L    1.47
TAF12    6883     TAF12    -0.1
TAF9    6880     TAF9    0.5
TAF6    6878     TAF6    -0.45