Question: New Column based on another data frame
j.lunger18 wrote (3 months ago):

Hi, I am trying to solve this problem for a data frame of variants, so that I can say which domain each variant falls within.

> ranges
  start end domain_name
1     1   3   beginning
2     4   6     middle1
3     7   8     middle2
4     9  11         end

> positions
   ID position
1   a        0
2   b        1
3   c        2
4   d        3
5   e        4
6   f        5
7   g        6
8   h        7
9   i        8
10  j        9
11  k       10
12  l       11
13  m       12
14  n       13

I want to add a column to "positions" that tells me which domain each position falls in (there could be multiple domains for a single variant). Thanks!

Tags: domains, R, genome
Brice Sarver wrote (3 months ago):

Something like this will work. It assumes non-overlapping domains and needs no special R packages. Values are cast to numeric to avoid any character conflicts. Note that ranges must be a global variable.

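# Return the domain name for one position, or NA if it lies outside every range.
# Relies on the global data.frame `ranges` (columns: start, end, domain_name).
locate_domain <- function(position) {
  for (i in seq_len(nrow(ranges))) {
    # Inclusive sequence of positions covered by domain i
    r <- as.numeric(ranges[i, 1]):as.numeric(ranges[i, 2])
    if (position %in% r) {
      return(as.character(ranges[i, 3]))
    }
  }
  NA_character_
}

positions <- cbind(positions, domain = sapply(as.numeric(positions$position), locate_domain))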

For each position, this expands each row of ranges into a sequence of positions on the fly, returns the domain of the first range that contains the position, and then cbinds the result onto the positions data.frame. Alternatively, you could pre-compute the ranges once, store them in a list named by domain, and return the name of the matching entry.
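
As a rough sketch of the pre-computed alternative, the block below builds the list of ranges once and looks each position up in it. It also returns every matching domain, since the question mentions a variant may fall in more than one. The names `domain_positions` and `locate_domains`, the ";" separator, and the choice of NA for positions outside all domains are assumptions made for illustration, not part of the original answer.

```r
# Example data copied from the question
ranges <- data.frame(
  start = c(1, 4, 7, 9),
  end = c(3, 6, 8, 11),
  domain_name = c("beginning", "middle1", "middle2", "end"),
  stringsAsFactors = FALSE
)
positions <- data.frame(
  ID = letters[1:14],
  position = 0:13,
  stringsAsFactors = FALSE
)

# Pre-compute, once, the positions covered by each domain,
# stored as a list named by domain.
domain_positions <- Map(seq, ranges$start, ranges$end)
names(domain_positions) <- ranges$domain_name

# Hypothetical helper: return every domain containing `position`,
# joined by ";", or NA if the position lies outside all domains.
locate_domains <- function(position, lookup) {
  in_domain <- vapply(lookup, function(r) position %in% r, logical(1))
  hits <- names(lookup)[in_domain]
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = ";")
}

# Add the new column; vapply guarantees a character vector.
positions$domain <- vapply(
  as.numeric(positions$position),
  locate_domains,
  character(1),
  lookup = domain_positions
)
```

Passing the lookup list as an argument avoids the dependence on a global variable. For genome-scale data, an interval join such as findOverlaps from the GenomicRanges package would likely scale better than expanding every range.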
