Hello, here is an EMT scoring method from Tan et al. (2014):
The empirical cumulative distribution function (ECDF) was estimated for Epi and Mes gene sets. The 2KS test was employed to compute the difference between Mes ECDF (ECDFMes) and Epi ECDF (ECDFEpi). The 2KS score was then taken as the EMT score. A sample with a positive EMT score exhibits a more Mes phenotype, whereas a negative EMT score reflects a more Epi phenotype. Note that the 2KS test allows segregation of samples into Epi (2KS score ECDFEpi > ECDFMes; P < 0.05), intermediate Epi (2KS score ECDFEpi > ECDFMes; P ≥ 0.05), intermediate Mes (2KS score ECDFEpi < ECDFMes, P ≥ 0.05) and Mes (2KS score ECDFEpi < ECDFMes, P < 0.05).
I want to reproduce this in R. My understanding is that there is no need to call the ecdf function directly; ks.test alone should be enough to compare the distributions of the Epi and Mes gene sets.
My problem is that I can't get a negative 2KS score in R: the D statistic returned by ks.test is never negative, as the following code shows:
    > m <- rnorm(200, mean = 1)
    > n <- rnorm(200, mean = 2)
    > plot(ecdf(m), col = "green")
    > plot(ecdf(n), col = "red", add = TRUE)
    > ks.test(m, n, alternative = "greater")

        Two-sample Kolmogorov-Smirnov test

    data:  m and n
    D^+ = 0.46, p-value < 2.2e-16
    alternative hypothesis: the CDF of x lies above that of y

    > ks.test(m, n, alternative = "less")

        Two-sample Kolmogorov-Smirnov test

    data:  m and n
    D^- = 0.005, p-value = 0.995
    alternative hypothesis: the CDF of x lies below that of y
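One possible workaround, as a rough sketch only: ks.test always reports a non-negative distance, so a signed score has to be built by hand from the two one-sided statistics. The helper names signed_ks and classify_emt below are hypothetical, and the sign convention (positive when Mes values tend to be larger than Epi values) and the use of the two-sided p-value for the 0.05 cutoff are my assumptions. The paper does not spell out either detail.

```r
# Hypothetical helper: signed two-sample KS score.
# Assumption: the score is positive when values in `mes` tend to be larger
# than values in `epi`, i.e. when the ECDF of `epi` lies above that of `mes`.
signed_ks <- function(mes, epi) {
  # alternative = "greater" tests whether the CDF of x lies above that of y,
  # and its statistic is the largest value of (ECDF of x) - (ECDF of y).
  d_mes_high <- unname(ks.test(epi, mes, alternative = "greater")$statistic)
  d_epi_high <- unname(ks.test(mes, epi, alternative = "greater")$statistic)

  # Assumption: significance is taken from the two-sided test.
  two_sided <- ks.test(mes, epi)

  # Keep the larger one-sided deviation and attach its direction as the sign.
  score <- if (d_mes_high >= d_epi_high) d_mes_high else -d_epi_high
  list(score = score, p.value = two_sided$p.value)
}

# Hypothetical helper: map score and p-value to the four classes in the quote.
classify_emt <- function(res, alpha = 0.05) {
  if (res$score > 0) {
    if (res$p.value < alpha) "Mes" else "intermediate Mes"
  } else {
    if (res$p.value < alpha) "Epi" else "intermediate Epi"
  }
}

# Simulated data standing in for Epi and Mes gene expression in one sample.
set.seed(1)
epi <- rnorm(200, mean = 1)
mes <- rnorm(200, mean = 2)

res <- signed_ks(mes, epi)
res$score                   # positive: mes values are shifted to the right
classify_emt(res)
signed_ks(epi, mes)$score   # negative once the two roles are swapped
```

With real expression data, this would be run once per sample, with mes and epi holding that sample's expression values for the two gene sets.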
Reference: Tan T Z, Miow Q H, Miki Y, et al. Epithelial‐mesenchymal transition spectrum quantification and its efficacy in deciphering survival and drug responses of cancer patients. EMBO Molecular Medicine, 2014, 6(10): 1279-1293.