Question: Significance testing on regression with binned data
Asked 2 days ago by selplat21:

I have binned the following data by year, but I am wondering how to assess the significance of the resulting regression, given that the data are binned. I would like to run this regression separately for multiple traits against time binned by year.

Data$cuts <- cut(Data$year, breaks = c(seq(min(Data$year), max(Data$year), 20), max(Data$year)), labels = FALSE)

DataM <- Data[Data$sex=="M",]
DataF <- Data[Data$sex=="F",]

# The initialization was missing from the post; an empty data frame is assumed.
mean.df <- data.frame()

for (i in 2:8) {
  Mcuts <- DataM[which(DataM$cuts==i),]
  Fcuts <- DataF[which(DataF$cuts==i),]
  Mmean <- mean(Mcuts$trait, na.rm = TRUE)
  Fmean <- mean(Fcuts$trait, na.rm = TRUE)
  # paste() stores each value as a character string
  mean.df[i, "bin"] <- paste(i)
  mean.df[i, "mean_dif"] <- paste(Mmean-Fmean)
  # sample sizes: females, males, total
  mean.df[i, "ss_f"] <- paste(length(Fcuts$cuts))
  mean.df[i, "ss_m"] <- paste(length(Mcuts$cuts))
  mean.df[i, "ss_t"] <- paste(sum(length(Fcuts$cuts),length(Mcuts$cuts)))
}

lm1 <- lm(mean_dif ~ bin, data=mean.df)
plot(mean.df$bin, mean.df$mean_dif)

I don't really understand why you go through this whole process. Couldn't lm( trait ~ year + sex + year:sex ) tell you what you want? Anyway, since you take the means and subtract them for each bin, you end up with 7 values for 7 different levels, because bin is not an integer.
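A minimal sketch of the interaction model suggested in the comment above, fitted to the unbinned data. The column names year, sex, and trait come from the thread; the simulated values, the sample size, and the object names fit and coefs are assumptions made only so the example runs on its own.

```r
# Simulated stand-in for the poster's `Data` (columns year, sex, trait are
# taken from the thread; the values below are invented for illustration).
set.seed(1)
n <- 200
Data <- data.frame(
  year = sample(1880:2020, n, replace = TRUE),
  sex  = factor(sample(c("F", "M"), n, replace = TRUE))
)
# Build in a sex difference that grows over time, plus noise.
Data$trait <- 10 + 0.01 * (Data$year - 1880) * (Data$sex == "M") + rnorm(n)

# Interaction model on the unbinned data; `year * sex` expands to
# year + sex + year:sex, the formula suggested in the comment above.
fit <- lm(trait ~ year * sex, data = Data)

# The year:sexM row tests whether the M - F difference changes with year.
coefs <- coef(summary(fit))
print(coefs["year:sexM", ])
```

Because every observation enters the fit, the test for the year:sexM term uses the full sample rather than one value per bin.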

Reply written 2 days ago by Asaf

Yes, I've done this, and it does tell me that there's an effect, but I am testing additional hypotheses that follow up on this effect.

I was able to fix it with the following, which provides a p-value:

lm1 <- lm(as.numeric(mean_dif) ~ as.numeric(bin), data = mean.df)
plot(as.numeric(mean.df$bin), as.numeric(mean.df$mean_dif))

However, is this p-value usable, given that the data are binned? Additionally, some bins have few or no data points. Do I exclude these, or do I need confidence intervals, etc.?

Reply modified 1 day ago, written 1 day ago by selplat21