Question: Calculate The Weighted Mean Of Matrix In R: Fastest Way
1
10.0 years ago by
Sirus790
Boston/USA
Sirus790 wrote:

Hello everybody, I want to create a simple R script that calculates the center of a cluster. If X = {X1, X2, ..., Xn} is my data matrix and U = {U_ik} is the membership of element Xi in cluster k, I want to do the following in the fastest way:

Here each Xi is multiplied by a weight, the products are summed, and the sum is divided by the sum of the weights, so it is a weighted mean, except that each Xi is a vector. R has the function weighted.mean(), but it only accepts numeric values. I have written a script that works, but for a large number of X it is slow. The code is:
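Read off from the loop in the question, the quantity being computed appears to be the following. The notation is an assumption made here for illustration: $x_i$ is the $i$-th column of X, $u_{ki}$ is the entry `U[k,i]`, $n$ is the number of columns, and $c_k$ is the center of cluster $k$.

$$
c_k = \frac{\sum_{i=1}^{n} u_{ki}^{m}\, x_i}{\sum_{i=1}^{n} u_{ki}^{m}}
$$

Terms with $u_{ki} = 0$ contribute nothing to either sum, which is why the loop can skip them.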

``````
# U : membership matrix (clusters in rows, elements in columns)
# X : our data (one element per column)
# m : parameter (exponent applied to the memberships)
# k : index of the cluster

cluster_center <- function(U, X, m, k)
{
  Numerator <- matrix(0, nrow(X), 1)
  Denominator <- 0

  for (i in seq_len(ncol(U)))  # go through the elements of cluster k
  {
    if (U[k, i] != 0)
    {
      Numerator <- Numerator + (U[k, i]^m) * X[, i]
      Denominator <- Denominator + (U[k, i]^m)
    }
  }
  Numerator / Denominator
}
``````
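On the remark about weighted.mean(): it takes one numeric vector plus a weight vector, so one way to use it on a matrix is to call it once per row with apply(). The following is only a sketch on made-up data, assuming the same layout as the loop (clusters in rows of U, elements in columns of X). Since apply() still loops over rows in R, it is not expected to be the fastest option.

```r
# Made-up example data; sizes and seed are arbitrary assumptions.
set.seed(1)
X <- matrix(rnorm(12), nrow = 3, ncol = 4)   # 3 features, 4 elements
U <- matrix(runif(8), nrow = 2, ncol = 4)    # 2 clusters, 4 elements
m <- 2
k <- 1

w <- U[k, ]^m                                 # one weight per element (column of X)
center <- apply(X, 1, weighted.mean, w = w)   # weighted mean of each row of X
center
```

The result is a plain vector with one entry per row of X.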
R matrix statistics • 8.1k views
modified 7.0 years ago by Biostar ♦♦ 20 • written 10.0 years ago by Sirus790

By the way, if all the elements of your U matrix are only 0 or 1, why do you raise them to the power of m?

You are doing it wrong: I just figured out that your code doesn't do what your formula says. You have to take X^m, not U^m!

Here the X values may be 0 or 1, but the U values are real numbers. In fact, they represent the membership degree of element Xj in cluster Ci.

5
10.0 years ago by
Bergen, Norway
Michael Dondrup 47k wrote:

``````
foo2 <- function(U, X, m, k) X %*% U[k, ]^m / sum(U[k, ]^m)
``````

plus make some sample data:

``````
U <- matrix(sample(c(0, 1), size = 10 * 10, replace = TRUE), nrow = 10, ncol = 10)
X <- matrix(rnorm(100), nrow = 10, ncol = 10)
``````

let's try it out:

``````
# foo1 is your loop-based function from the question
> all(foo2(U, X, 2, 1) == foo1(U, X, 2, 1))
[1] TRUE
``````
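A note on the comparison: `==` tests exact floating-point equality. With real-valued memberships the loop and the matrix product add the same terms in a different order, so the results may differ in the last bits. A tolerance-based check with all.equal() is a safer sketch. The name `loop_center` and the random data are assumptions for illustration.

```r
# Loop version, restated so the example is self-contained (hypothetical name).
loop_center <- function(U, X, m, k) {
  num <- matrix(0, nrow(X), 1)
  den <- 0
  for (i in seq_len(ncol(U))) {
    if (U[k, i] != 0) {
      num <- num + U[k, i]^m * X[, i]
      den <- den + U[k, i]^m
    }
  }
  num / den
}

# One-line version from the answer.
foo2 <- function(U, X, m, k) X %*% U[k, ]^m / sum(U[k, ]^m)

set.seed(42)
U <- matrix(runif(100), nrow = 10)   # real-valued memberships
X <- matrix(rnorm(100), nrow = 10)

# Compare with a numerical tolerance instead of exact equality.
same <- isTRUE(all.equal(loop_center(U, X, 2, 1), foo2(U, X, 2, 1)))
same
```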

It will be much faster for large data, since the explicit loop is replaced by a single matrix product.
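To see the difference on your own machine, the two versions can be timed with system.time(). This is a rough sketch only: the matrix sizes are arbitrary assumptions, `loop_center` is a hypothetical name for the loop version, and the timings will vary between machines and runs.

```r
loop_center <- function(U, X, m, k) {   # loop version (hypothetical name)
  num <- matrix(0, nrow(X), 1)
  den <- 0
  for (i in seq_len(ncol(U))) {
    if (U[k, i] != 0) {
      num <- num + U[k, i]^m * X[, i]
      den <- den + U[k, i]^m
    }
  }
  num / den
}
foo2 <- function(U, X, m, k) X %*% U[k, ]^m / sum(U[k, ]^m)

set.seed(1)
n_feat <- 50      # rows of X (assumed size)
n_elem <- 20000   # columns of X and U (assumed size)
X <- matrix(rnorm(n_feat * n_elem), nrow = n_feat)
U <- matrix(runif(3 * n_elem), nrow = 3)

t_loop <- system.time(r_loop <- loop_center(U, X, 2, 1))
t_vec  <- system.time(r_vec  <- foo2(U, X, 2, 1))

# Elapsed wall-clock seconds for each version.
c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]])
```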

Nice work converting C into R. I didn't know about all(), that's a sweet shortcut for confirming your solution.

Wow, it seems fast. I will try it now. Thank you for your precious help; my data is big.

I have tried your method with this sample. all(foo2(U,X,2,1) == foo1(U,X,2,1)) is TRUE, but all(foo2(X,U,2,1) == foo1(X,U,2,1)) is FALSE. I think it needs only a small change; I will try to figure it out. And really, thank you for this help. I am happy to see the equivalent code in one line.

Of course the result will be different if you exchange the parameters ;) The vector-matrix multiplication %*% is sensitive to the order of its arguments: x %*% A = t(t(A) %*% t(x)) (with x a row vector and t() the transposition function, if I remember the rule of thumb from linear algebra class correctly: row times column ;)

Yeah, that's true, but I mean that normally the results of foo2 and foo1 must be the same whatever the order of the parameters. foo2(X,U,m,k) should be equal to foo1(X,U,m,k), because we are applying the same formula, so we should get the same results.

No, because R is not strongly typed. For example, you can matrix-multiply a vector x with a matrix M: x %*% M. If you exchange the order to M %*% x, the result is different. So I can define a perfectly valid function that assumes x is a vector and M a matrix; if the arguments are swapped, the result can be different.
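A small sketch of the point about operand order, using made-up values. In `%*%`, R treats a plain vector as a row vector when it is on the left and as a column vector when it is on the right, so the two orders give different shapes and, in general, different numbers.

```r
M  <- matrix(1:6, nrow = 2, ncol = 3)   # 2 x 3 matrix
x2 <- c(1, 2)                           # length 2
x3 <- c(1, 2, 3)                        # length 3

x2 %*% M   # x2 acts as a 1 x 2 row vector, result is 1 x 3
M %*% x3   # x3 acts as a 3 x 1 column vector, result is 2 x 1

S <- matrix(1:4, nrow = 2)              # square, so both orders are defined
v <- c(1, 2)
left  <- v %*% S                        # 1 x 2
right <- S %*% v                        # 2 x 1

# The left product equals the transpose of t(S) %*% v, not S %*% v.
isTRUE(all.equal(left, t(t(S) %*% v)))
```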

0
10.0 years ago by
Sirus790
Boston/USA
Sirus790 wrote:

Thank you very much Michael Dondrup,

I have figured out the solution, inspired by your code. Here it is:

``````
foo1 <- function(U, X, m, k) {
  # scale column i of X by U[k, i]^m, then sum across columns via a vector of ones
  (t(t(X) * U[k, ]^m) %*% rep(1, length(U[k, ]))) / sum(U[k, ]^m)
}
``````

As for execution time, the first version that I wrote is fast when the U matrix has a lot of 0's.
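One possible way to combine both ideas, skipping the zeros while still using a matrix product, is to select the non-zero memberships first. This is a sketch under the same layout assumptions as above; `foo_nz` is a hypothetical name and the sparse test matrix is made up.

```r
foo2 <- function(U, X, m, k) X %*% U[k, ]^m / sum(U[k, ]^m)

# Only the columns with non-zero membership enter the matrix product.
foo_nz <- function(U, X, m, k) {
  nz <- which(U[k, ] != 0)               # indices of elements that belong to cluster k
  w  <- U[k, nz]^m
  X[, nz, drop = FALSE] %*% w / sum(w)   # drop = FALSE keeps X a matrix even for one column
}

set.seed(7)
U <- matrix(runif(40) * rbinom(40, 1, 0.3), nrow = 4)   # mostly zeros
U[1, 1] <- 0.5                                          # make sure row 1 has a non-zero entry
X <- matrix(rnorm(50), nrow = 5, ncol = 10)

agree <- isTRUE(all.equal(foo_nz(U, X, 2, 1), foo2(U, X, 2, 1)))
agree
```

Whether this pays off depends on how sparse U is, so it is worth timing on real data.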