Easier way to add another column if values in another fall within a range in R
7 weeks ago
unawaaz • 0

Hi everyone,

I have a dataset and want to add a column that labels each row according to the range its value falls in.

My df is essentially a column of ages, and I want to add an Age Interval column. My code works when the ages are above 0, but I'm not sure how to handle negative values (pcw). This is what I'm currently doing to get what I want:

df$AgeInterval[df$Age >= -0.701 & df$Age <= -0.62] = "4-7pcw"
df$AgeInterval[df$Age >= -0.621 & df$Age <= -0.58] = "8-9pcw"
df$AgeInterval[df$Age >= -0.57 & df$Age <= -0.53] = "10-12pcw"

But there has to be a simpler way to do this?

For values over 0 I normally use the following, and it gets the job done:

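library(magrittr)  # provides the %<>% assignment pipe
library(dplyr)

# Bin ages into (-1,9], (9,19], ... and turn each interval string into a label like "0-9yrs"
df %<>% mutate(age_interval = as.character(cut(Age, seq(-1, 100, by = 10)))) %>% 
  mutate(AgeInterval = sapply(age_interval, function(i) {
    paste0(
      as.numeric(gsub("^\\(([-0-9]+),.+", "\\1", i)) + 1,
      "-",
      as.numeric(gsub(".+,([0-9]+)\\]$", "\\1", i)), "yrs"
    )})) %>% 
  dplyr::select(-age_interval)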

How many categories do you have?

13 in total

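You could use case_when() and between() within mutate(), e.g.:

df %>% mutate(AgeInterval = case_when(between(Age, -0.701, -0.62) ~ "4-7pcw",
                                      between(Age, -0.621, -0.58) ~ "8-9pcw"))

Note that between() is inclusive at both ends and case_when() returns the first match, so a value in the overlap of two ranges gets the earlier label.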

Strongly consider reading the docs or working through the vignettes for data.table.

7 weeks ago
zx8754 11k

Read about cut and try this example:

set.seed(1); x <- runif(100, -1, 1)

xCut <- cut(x, breaks = seq(-1, 1, 0.5), labels = paste0("age", 1:4))

head(xCut)
# [1] age2 age2 age3 age4 age1 age4
# Levels: age1 age2 age3 age4

table(xCut)
# xCut
# age1 age2 age3 age4 
#   20   32   21   27 
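
Building on the labels argument above, here is a minimal sketch of generating the labels from the breaks themselves, which would avoid parsing the interval strings with gsub. It assumes the ten-year bins from the question; `ages` is hypothetical sample data and `age_group` is just an illustrative name.

```r
# Breaks as in the question: -1, 9, 19, ..., 99
breaks <- seq(-1, 100, by = 10)

# One label per interval (lower, upper]: "lower+1 - upper yrs"
labels <- paste0(head(breaks, -1) + 1, "-", tail(breaks, -1), "yrs")

# Hypothetical ages, for illustration only
ages <- c(0, 9, 10, 45, 99)

# cut() returns a factor; wrap in as.character() if a character column is wanted
age_group <- cut(ages, breaks = breaks, labels = labels)
as.character(age_group)
```

With the default right = TRUE each interval is closed on the right, so an age of 9 falls in "0-9yrs" and an age of 10 in "10-19yrs".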

What do I do if the break is not 0.5?

You can supply any breaks you need; they do not have to be evenly spaced. For example, breaks = c(-1, 0, 1) gives 2 groups: -1 to 0 and 0 to 1.
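
Applied to the negative pcw values in the question, a rough sketch might look like the following. The cut points are an assumption: they merge the three ranges from the question into contiguous bins (the original ranges overlap slightly at -0.62 and leave a gap between -0.58 and -0.57), so adjust them to your real boundaries. The sample `df` is hypothetical.

```r
# Hypothetical data on the question's scale (negative ages = pcw)
df <- data.frame(Age = c(-0.70, -0.65, -0.60, -0.55, -0.62))

# Assumed, unevenly spaced cut points and one label per interval
breaks <- c(-0.701, -0.62, -0.58, -0.53)
labels <- c("4-7pcw", "8-9pcw", "10-12pcw")

# Intervals are (lower, upper]; include.lowest = TRUE also keeps Age == -0.701
df$AgeInterval <- cut(df$Age, breaks = breaks, labels = labels,
                      include.lowest = TRUE)
df
```

Ages outside the outermost breaks come back as NA, which is a quick way to spot values not covered by any interval. For all 13 categories, extend `breaks` to 14 values and `labels` to 13.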
