From a biostars question about interleaving vectors...
For loops can be really slow in R (though they don't have to be), so I wanted to compare the answers provided to this question: a straightforward for loop versus various workarounds intended to speed the process up.
f_for <- function(a, b) {
  # grow the result one pair at a time; re-allocates res on every iteration
  res <- c()
  for (i in 1:length(a)) {
    res <- c(res, a[i], b[i])
  }
  res
}
f_preallocate <- function(a, b) {
  # allocate the full-length result up front, then fill the odd and even positions
  n <- length(a) + length(b)
  res <- character(n)
  res[seq(1, n - 1, 2)] <- a
  res[seq(2, n, 2)] <- b
  res
}
f_rbind <- function(a, b) as.character(rbind(a, b))  # 2-row matrix flattens column-wise, interleaving a and b
f_ord <- function(a, b) {
  # sort the combined vector back by original position: a[1], b[1], a[2], b[2], ...
  ord <- order(c(1:length(a), 1:length(b)))
  res <- c(a, b)[ord]
  res
}
fxns <- c("f_for", "f_preallocate", "f_rbind", "f_ord")
a <- c("1.TY","2.TY","4.TY","5.TY", "0.TY")
b <- c("1.MN","2.MN","4.MN","5.MN", "0.MN")
for (fname in fxns) {
  cat(fname, ":\n\t")
  print(system.time(replicate(10000, do.call(fname, list(a = a, b = b)))))
}
## f_for :
##    user  system elapsed
##    0.33    0.00    0.33
## f_preallocate :
##    user  system elapsed
##    1.64    0.03    1.67
## f_rbind :
##    user  system elapsed
##    0.15    0.00    0.16
## f_ord :
##    user  system elapsed
##    0.48    0.00    0.48
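As a quick sanity check (separate from the timings above), all four functions should return the same interleaved vector for the small a/b inputs; a minimal sketch:
results <- lapply(fxns, function(f) do.call(f, list(a = a, b = b)))
stopifnot(length(unique(results)) == 1)  # all four agree
results[[1]]
##  [1] "1.TY" "1.MN" "2.TY" "2.MN" "4.TY" "4.MN" "5.TY" "5.MN" "0.TY" "0.MN"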
With this amount of data the for loop doesn't do too badly, and even beats the pre-allocation approach (I guess because of all the futzing to get lengths/indices).
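As an aside, if a and b are guaranteed to be the same length, that index bookkeeping can be skipped by recycling logical indices; f_preallocate2 below is just a hypothetical variant for illustration, not one of the benchmarked answers:
f_preallocate2 <- function(a, b) {
  # assumes length(a) == length(b); logical indices recycle across the result
  res <- character(2 * length(a))
  res[c(TRUE, FALSE)] <- a  # odd positions
  res[c(FALSE, TRUE)] <- b  # even positions
  res
}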
The main concern with the for-loop approach is that constantly re-allocating memory for the growing result vector slows the process down. So, let's see what happens when the to-be-interleaved vectors are 5000 elements long:
for (fname in fxns) {
  cat(fname, ":\n\t")
  print(system.time(replicate(100, do.call(fname, list(a = rep(a, 1000), b = rep(b, 1000))))))
}
## f_for :
##    user  system elapsed
##   29.58    0.01   29.72
## f_preallocate :
##    user  system elapsed
##    0.23    0.00    0.24
## f_rbind :
##    user  system elapsed
##    0.11    0.00    0.11
## f_ord :
##    user  system elapsed
##    0.19    0.00    0.19
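That pattern is consistent with the repeated re-allocation, not the loop itself, being the bottleneck: growing res with c() copies the whole vector every iteration, so the total work grows quadratically with length. A for loop that writes into a pre-allocated result should behave much more like f_preallocate; here's a sketch (f_for_prealloc is a hypothetical name, not timed above):
f_for_prealloc <- function(a, b) {
  # same loop structure, but the result is allocated once and filled in place
  res <- character(length(a) + length(b))
  for (i in seq_along(a)) {
    res[2 * i - 1] <- a[i]
    res[2 * i]     <- b[i]
  }
  res
}
This keeps the readability of the loop while avoiding the quadratic copying, leaving only the R-level loop overhead.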