STRING combined score: a bug or something else?
6.0 years ago
yliueagle ▴ 230

I am using the STRING protein interaction database and have a question about how the combined score of an interaction is calculated. This reference, http://nar.oxfordjournals.org/content/33/suppl_1/D433.long, states that the combined score is S = 1 - (1 - S_1) * ... * (1 - S_8), where S_i is the score from the i-th type of evidence. However, this result, http://string-db.org/newstring_userdata/tabdelimited.P5Gb87rP_Ayk.txt, shows a discrepancy: two interactions have identical individual scores but slightly different combined scores:

HRAS	RASA1	1849235	1846381	ENSP00000309845	ENSP00000274376	0.000	0.000	0.000	0.000	0.000	0.000	0.900	0.000	0.899
DOK2	IL2RB	1846493	1842567	ENSP00000276420	ENSP00000216223	0.000	0.000	0.000	0.000	0.000	0.000	0.900	0.000	0.900

Two more examples show a much larger discrepancy between the observed combined score and the one computed with the formula from the reference (http://string-db.org/newstring_userdata/tabdelimited.ThlUvDQgUN9n.txt):

IKZF2    IKZF1    1860848    1851367    ENSP00000410447    ENSP00000331614    0.000    0.000    0.000    0.950    0.000    0.576    0.000    0.574    0.587

ZPBP2    ZPBP    1851843    1842216    ENSP00000335384    ENSP00000046087    0.000    0.000    0.000    0.851    0.000    0.000    0.360    0.890    0.444
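
To make the comparison concrete, here is a minimal Python sketch of the formula as I read it in the reference, applied to the four rows quoted in this post. I am assuming that the eight numeric columns before the last one are the individual scores S_1..S_8 and that the last column is the combined score reported by STRING. `naive_combined` is just my own helper name, not anything provided by STRING.

```python
from math import prod  # available in Python 3.8+

def naive_combined(scores):
    """Combined score per the quoted formula: S = 1 - prod(1 - S_i)."""
    return 1.0 - prod(1.0 - s for s in scores)

# (pair, eight individual scores, combined score reported by STRING)
# Assumption: column order is taken as-is from the tab-delimited output.
rows = [
    ("HRAS-RASA1",  [0.0, 0.0, 0.0, 0.0,   0.0, 0.0,   0.900, 0.0],   0.899),
    ("DOK2-IL2RB",  [0.0, 0.0, 0.0, 0.0,   0.0, 0.0,   0.900, 0.0],   0.900),
    ("IKZF2-IKZF1", [0.0, 0.0, 0.0, 0.950, 0.0, 0.576, 0.0,   0.574], 0.587),
    ("ZPBP2-ZPBP",  [0.0, 0.0, 0.0, 0.851, 0.0, 0.0,   0.360, 0.890], 0.444),
]

for pair, scores, reported in rows:
    computed = naive_combined(scores)
    print(f"{pair}: computed {computed:.3f}, reported {reported:.3f}")
```

With this reading of the columns, the formula gives 0.900 for the first two pairs and roughly 0.99 for the last two, compared with the reported 0.587 and 0.444.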

Is this a bug, or is there a different formula for calculating the combined score from the individual scores?

STRING protein interaction