Hiya,
I am looking at fitting a Cox proportional hazards model in R using the survival package. My variables are: status (mo; 1 = died, 0 = censored), time in hours (hour), temperature treatment (temp_c, 3 groups) and condition factor (CF, a continuous variable).
I am confused about handling the temperature group variable and have trialled 2 methods:
library(survival)
data <- read.delim("coxnew.txt", header=TRUE)
data$SurvObj <- with(data, Surv(hour, mo == 1))
(i) I have seen that in example models online, sex: m=0, f=1 or similar, so I used L=1,M=2,H=3 as the groups.
mod <- coxph(SurvObj ~ temp_c + CF , data = data)
and this gave me a summary with two outputs, one for temp_c and one for CF.
(ii) I have also made temperature treatment a factor
data$temp_c <- factor(data$temp_c)
data$temp_c <- relevel(data$temp_c, ref="H")
mod <- coxph(SurvObj ~ temp_c + CF, data = data)
and this gave me a summary with three outputs, one for temp_cL, one for temp_cM and one for CF
I am not sure which is correct to use, as (ii) requires that you set one group as the reference level. The confusion comes into play when I try to see what the temperature plots look like with CF held at its mean value, as the output graphs look different for the two methods.
(i)
temp_new <- with(data, data.frame(temp_c= c(1,2,3), CF = rep(mean(CF, na.rm = TRUE), 3)))
or (ii)
temp_new <- with(data, data.frame(temp_c= c("L","M","H"), CF = rep(mean(CF, na.rm = TRUE), 3)))
Does it matter which I use (is it personal preference), or does one version make more statistical sense? I was inclined to go with the first (i), as this is comparable to the m=0, f=1 style, and I assume it compares the three groups among themselves rather than just comparing two groups to the assigned reference level?
Many thanks, Bekah
Cheers! Okay, hopefully I am correct in interpreting your answer as follows. If it's set as
data$SurvObj <- with(data, Surv(hour, mo == 1))
A hazard ratio of 0.7 for temperature coded simply as 1, 2, 3 would mean that, with increasing group number, the chance of death (1) over censoring (0) decreases across temperature treatments 1 --> 2 --> 3.
Hazard ratios of 5 for L and 3 for M, with temperature as a factor and H as the reference level, would mean an increased chance of death (1) over censoring (0) for both L and M when compared to H, with the chance being higher for temperature L?
Best wishes, Bekah
Yes, 0.7 indicates that, with increasing temp_c value, the HR is reduced, when adjusted for your condition factor (CF). I am not sure of the exact interpretation of having temperature encoded as a continuous variable of 1, 2, 3; it would make more sense for it to be categorical. A continuous temperature variable makes more sense as Kelvin values, or, granted, Celsius.
The other values for L and M are readily interpreted. Considering that you set H as the reference level, they say that the low temperature group has the highest hazard of death, when adjusted for CF. The medium temperature group also has a higher hazard of death (i.e., a higher hazard when compared to the high temperature group).
You should also be looking at the upper and lower confidence intervals (CIs), and the Log Rank p-value. For example, as a general rule of thumb: if we have a HR=0.7 but its upper CI passes 1.0, then that estimate is less reliable, and this will be reflected in the p-value.
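To make the difference between the two codings concrete, here is a rough sketch on simulated data (not your data). The data frame sim, the columns temp_num and temp_fac, the 72-hour cut-off and the death rates are all made up for illustration. With the numeric coding the model estimates a single HR per one-step increase, so it assumes the same ratio from 1 to 2 as from 2 to 3. With the factor coding each non-reference group gets its own HR against H.

```r
library(survival)

# Simulated data (hypothetical values, for illustration only)
set.seed(1)
n    <- 300
temp <- sample(c("L", "M", "H"), n, replace = TRUE)
CF   <- rnorm(n, mean = 1, sd = 0.1)
rate <- c(L = 0.05, M = 0.03, H = 0.01)[temp]   # assumed death rates per hour
event_time <- rexp(n, rate = rate)
hour <- pmin(event_time, 72)                    # follow-up stops at 72 h
mo   <- as.integer(event_time <= 72)            # 1 = died, 0 = censored

sim <- data.frame(
  hour, mo, CF,
  temp_num = match(temp, c("L", "M", "H")),     # L=1, M=2, H=3
  temp_fac = relevel(factor(temp), ref = "H")   # factor with H as reference
)

# (i) numeric coding: one coefficient, i.e. one HR per step 1 -> 2 -> 3
mod_num <- coxph(Surv(hour, mo) ~ temp_num + CF, data = sim)

# (ii) factor coding: one coefficient per non-reference group (L vs H, M vs H)
mod_fac <- coxph(Surv(hour, mo) ~ temp_fac + CF, data = sim)

names(coef(mod_num))
names(coef(mod_fac))

# Hazard ratios with lower and upper 95% confidence limits
ci_fac <- summary(mod_fac)$conf.int
ci_fac

# Likelihood-ratio comparison of the linear-trend model against the factor model
anova(mod_num, mod_fac)

# Predicted survival curves for each group with CF held at its mean
new_fac <- data.frame(
  temp_fac = factor(c("L", "M", "H"), levels = levels(sim$temp_fac)),
  CF = mean(sim$CF)
)
curves_fac <- survfit(mod_fac, newdata = new_fac)
```

Calling plot(curves_fac) should then draw one curve per row of new_fac. Curves built from mod_num will generally look different, because that model forces the three groups onto an evenly spaced trend on the log-hazard scale.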
Thank you so much for all your help! :) this is much clearer now!