From a biostars question about interleaving vectors...
For loops can be really slow in R (though they don't have to be), so I wanted to compare the answers provided to this question: a straightforward for loop versus various workarounds intended to speed the process up.
f_for <- function(a, b) {
  # grow the result one pair at a time; re-allocates res on every iteration
  res <- c()
  for (i in 1:length(a)) {
    res <- c(res, a[i], b[i])
  }
  res
}
f_preallocate <- function(a, b) {
  # allocate the full-length result up front, then fill the odd and even positions
  n <- length(a) + length(b)
  res <- character(n)
  res[seq(1, n - 1, 2)] <- a
  res[seq(2, n, 2)] <- b
  res
}
f_rbind <- function(a, b) as.character(rbind(a, b))  # 2-row matrix flattens column-wise, interleaving a and b
f_ord <- function(a, b) {
  # sort the combined vector back by original position: a[1], b[1], a[2], b[2], ...
  ord <- order(c(1:length(a), 1:length(b)))
  res <- c(a, b)[ord]
  res
}
fxns <- c("f_for", "f_preallocate", "f_rbind", "f_ord")
a <- c("1.TY","2.TY","4.TY","5.TY", "0.TY")
b <- c("1.MN","2.MN","4.MN","5.MN", "0.MN")
for (fname in fxns) {
  cat(fname, ":\n\t")
  print(system.time(replicate(10000, do.call(fname, list(a = a, b = b)))))
}
## f_for :
##    user  system elapsed
##    0.33    0.00    0.33
## f_preallocate :
##    user  system elapsed
##    1.64    0.03    1.67
## f_rbind :
##    user  system elapsed
##    0.15    0.00    0.16
## f_ord :
##    user  system elapsed
##    0.48    0.00    0.48
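As a quick sanity check (separate from the timings above), all four functions should return the same interleaved vector for the small a/b inputs; a minimal sketch:
results <- lapply(fxns, function(f) do.call(f, list(a = a, b = b)))
stopifnot(length(unique(results)) == 1)  # all four agree
results[[1]]
##  [1] "1.TY" "1.MN" "2.TY" "2.MN" "4.TY" "4.MN" "5.TY" "5.MN" "0.TY" "0.MN"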
With this amount of data the for loop doesn't do too badly, and even beats the pre-allocation approach (I guess because of all the futzing to get lengths/indices).
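As an aside, if a and b are guaranteed to be the same length, that index bookkeeping can be skipped by recycling logical indices; f_preallocate2 below is just a hypothetical variant for illustration, not one of the benchmarked answers:
f_preallocate2 <- function(a, b) {
  # assumes length(a) == length(b); logical indices recycle across the result
  res <- character(2 * length(a))
  res[c(TRUE, FALSE)] <- a  # odd positions
  res[c(FALSE, TRUE)] <- b  # even positions
  res
}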
The main concern with the for-loop approach is that constantly re-allocating memory for the growing result vector slows the process down. So, let's see what happens when the to-be-interleaved vectors are 5000 elements long:
for (fname in fxns) {
  cat(fname, ":\n\t")
  print(system.time(replicate(100, do.call(fname, list(a = rep(a, 1000), b = rep(b, 1000))))))
}
## f_for :
##    user  system elapsed
##   29.58    0.01   29.72
## f_preallocate :
##    user  system elapsed
##    0.23    0.00    0.24
## f_rbind :
##    user  system elapsed
##    0.11    0.00    0.11
## f_ord :
##    user  system elapsed
##    0.19    0.00    0.19
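That pattern is consistent with the repeated re-allocation, not the loop itself, being the bottleneck: growing res with c() copies the whole vector every iteration, so the total work grows quadratically with length. A for loop that writes into a pre-allocated result should behave much more like f_preallocate; here's a sketch (f_for_prealloc is a hypothetical name, not timed above):
f_for_prealloc <- function(a, b) {
  # same loop structure, but the result is allocated once and filled in place
  res <- character(length(a) + length(b))
  for (i in seq_along(a)) {
    res[2 * i - 1] <- a[i]
    res[2 * i]     <- b[i]
  }
  res
}
This keeps the readability of the loop while avoiding the quadratic copying, leaving only the R-level loop overhead.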