Question: R Error - lm- object of type closure is not subsettable
4 months ago, amandinelecerfdefer wrote:

Hello, I'm trying to automate linear regression (lm) and the corresponding plot across a set of files. Example of one file:

score;EGFR_12;EGFR_24;EGFR_36;EGFR_48;EGFR_60;pa
-0,5992442;67,56217938;53,61312383;52,93430604;;;1
-0,6795702;53,28459074;57,23583761;43,94840102;51,36407098;;2

R code:

if (!require(devtools)) {
  install.packages("devtools")
  library(devtools)
}
install_github("larmarange/JLutils")

library(ggplot2)
library(JLutils)

ggplotRegression <- function (fit) {

  require(ggplot2)

  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5)))
}


setwd("/Users/Desktop/global")
files <- list.files(path = "data", pattern = (".csv$"))

for (k in 1:length(files)) {
  fname <- files[k]
  cat(paste0("Now analyse data/", fname, "...\n"))
  data <- read.csv2(paste0("data/", fname), header = T, stringsAsFactors = F, dec = ",")
  x<-summary(lm(data$EGFR_12 ~ data$score))$coefficients
  write.csv(x, file = paste0("summary/summary_egfr12", fname))


  p1<- ggplotRegression(lm(data$EGFR_12 ~ data$score))
  rm(data, x)
  Sys.sleep(5)


  y<-summary(lm(data$EGFR_24 ~ data$score))$coefficients
  write.csv(y, file = paste0("summary/summary_egfr24", fname))
  p2<- ggplotRegression(lm(data$EGFR_24 ~ data$score))

  rm(data, y)
  Sys.sleep(5)


  z<-summary(lm(data$EGFR_36 ~ data$score))$coefficients
  write.csv(z, file = paste0("summary/summary_egfr36", fname))
  p3<- ggplotRegression(lm(data$EGFR_36 ~ data$score))

  rm(data, z)
  Sys.sleep(5)


  a<-summary(lm(data$EGFR_48 ~ data$score))$coefficients
  write.csv(a, file = paste0("summary/summary_egfr48", fname))
  p4<- ggplotRegression(lm(data$EGFR_48 ~ data$score))

  rm(data, a)
  Sys.sleep(5)


  b<-summary(lm(data$EGFR_60 ~ data$score))$coefficients
  write.csv(b, file = paste0("summary/summary_egfr60", fname))
  p5 <- ggplotRegression(lm(data$EGFR_60 ~ data$score))

  rm(data, b)
  Sys.sleep(5)

  png(file = paste0("graphe_png/", fname), width = 350, height = "350")
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()

  pdf(file = paste0("graphe_pdf/", fname))
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()

}

So I tried to write a loop that, for each file in my data folder, fits an lm and then draws a graph by calling the ggplotRegression function.

Error:

Now analyse data/commun_GLOBALalloscore_norm-imput-avechla.csv...
Error in data$EGFR_24 : object of type 'closure' is not subsettable

How can I solve this error?

Thank you in advance

R • 213 views
modified 4 months ago • written 4 months ago by amandinelecerfdefer

What I'm trying to do: I have a set of csv files in a directory. For each file, I want to fit a linear regression of each of EGFR_12, EGFR_24, EGFR_36, EGFR_48 and EGFR_60 against score, and save each lm plot in a variable (?), with R2 and the p-value shown on the graph. Then I want to make a multiplot of all the previously saved plots. But despite my research, I can't get this to work.

modified 4 months ago • written 4 months ago by amandinelecerfdefer

See my answer. The issue is that you delete the value of data even though your code (the loop) needs it all the time.

written 4 months ago by ATpoint

Despite spending time rewriting the script based on your answers, I still can't get it to work.

I also tried:

setwd("/Users/amandinelecerfdefer/Desktop/global")
library(ggplot2)
ggplotRegression <- function (fit) {

  require(ggplot2)

  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5)))
}
files <- list.files(path = "data", pattern = (".csv$"))

for (k in 1:length(files)) {
  fname <- files[k]
  cat(paste0("Now analyse data/", fname, "...\n"))
  data <- read.csv2(paste0("data/", fname), header = T, stringsAsFactors = F, dec = ",")
  head(data)


  fit1 <- lm(EGFR_12 ~ score, data = data)
  x<-summary(fit1)$coefficients
  write.csv(x, file = paste0("summary/summary_egfr12", fname))
  p1<-ggplotRegression(fit1)
  fit2 <- lm(EGFR_24 ~ score, data = data)
  x<-summary(fit1)$coefficients
  write.csv(y, file = paste0("summary/summary_egfr24", fname))

  p2<- ggplotRegression(fit2)
  fit3 <- lm(EGFR_36 ~ score, data = data)
  p3<- ggplotRegression(fit3)
  fit4 <- lm(EGFR_48 ~ score, data = data)
  p4<- ggplotRegression(fit4)
  fit5 <- lm(EGFR_60 ~ score, data = data)
  p5<- ggplotRegression(fit5)
}

But I can't get multiplot to write to a pdf or a png.

modified 4 months ago • written 4 months ago by amandinelecerfdefer

Please stop using the answer field to add details. Use ADD REPLY. "Can't do it" is most uninformative. What is the problem? There is no point in adding so much code. Go through your code step by step without running the loop: set k <- 1 and then execute the commands one after another. Find the part that is causing trouble, then try to explain it here without adding a lot of code. Try to focus on the essential problem.

written 4 months ago by ATpoint

4 months ago, ATpoint (Germany) wrote:

Your code is highly redundant. The operations summary/write.csv/ggplotRegression appear multiple times. It would be smarter to write a single function that contains this basic workflow and then use loops or apply-like commands to run it on your data or on different columns of the csv you load. That way, if you need to modify the workflow at some point, you only have to change it in one place instead of editing the same command on several lines.
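
As a rough illustration of that idea, here is a minimal base R sketch. The helper name fit_one and the toy data frame are hypothetical and made up for this example; only the column names (score, EGFR_12, EGFR_24) follow the question. The plotting step is left out so that the sketch runs without ggplot2.

```r
# Hypothetical helper: fit `response ~ predictor` on a data frame and
# return the coefficient table of the fitted model.
fit_one <- function(df, response, predictor = "score") {
  fit <- lm(reformulate(predictor, response = response), data = df)
  summary(fit)$coefficients
}

# Toy data, made up for illustration only (not the asker's measurements).
set.seed(1)
toy <- data.frame(score = rnorm(20))
toy$EGFR_12 <- 60 + 5 * toy$score + rnorm(20)
toy$EGFR_24 <- 55 + 4 * toy$score + rnorm(20)

# Run the same workflow on every response column instead of copy-pasting it.
responses <- c("EGFR_12", "EGFR_24")
results <- lapply(setNames(responses, responses),
                  function(r) fit_one(toy, r))
names(results)
```

A call to write.csv or to a plotting function could go inside fit_one, so a later change to the workflow only has to be made once.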

The error you get generally means that you are trying to subset a function. I am not sure I understand your code. Inside for (k in 1:length(files)) {, the first code block does a couple of things with data and then calls rm(data). The next block runs y<-summary(lm(data$EGFR_24 ~ data$score))$coefficients without loading anything into the data variable again. Since your variable no longer exists, data is now interpreted as the function utils::data, and subsetting a function with $ raises this error. Check why you remove the data variable after the first code chunk.
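
The mechanism can be reproduced in a few lines. This is only a sketch: it uses the built-in cars data set as a stand-in for the csv contents and assumes a default R session in which the utils package is attached.

```r
# Stand-in data: the built-in `cars` data frame replaces the csv contents.
data <- cars          # the variable `data` now masks the function utils::data
class(data)           # "data.frame"

rm(data)              # removes the variable; the name resolves to utils::data again
class(data)           # "function"

# Subsetting a function with `$` fails; capture the message instead of stopping.
msg <- tryCatch(data$speed, error = function(e) conditionMessage(e))
print(msg)
```

The captured message is the "not subsettable" error from the question, triggered here by data$speed instead of data$EGFR_24.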

Generally, it is not recommended to use variable names such as data, sum, apply or mean, as all of these are also function names. Use something unique, such as my.data or tmp.data, to avoid this misinterpretation.
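
A sketch of a file loop that follows this naming advice is given below. It is an illustration under assumptions, not a drop-in replacement for your script: the temporary directory, the file names and the toy values are all made up, and only one regression (EGFR_12 ~ score) is fitted per file.

```r
# Toy input: two small semicolon-separated files in a temporary directory.
# All values are made up for illustration.
in_dir <- file.path(tempdir(), "data_demo")
dir.create(in_dir, showWarnings = FALSE)
set.seed(42)
for (i in 1:2) {
  toy <- data.frame(score = rnorm(10))
  toy$EGFR_12 <- 60 + 5 * toy$score + rnorm(10)
  write.csv2(toy, file.path(in_dir, paste0("file", i, ".csv")),
             row.names = FALSE)
}

files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
coefs <- list()
for (f in files) {
  # `tmp.data` does not clash with utils::data, and it is overwritten on
  # every iteration, so no rm() call is needed inside the loop.
  tmp.data <- read.csv2(f, header = TRUE, stringsAsFactors = FALSE)
  fit <- lm(EGFR_12 ~ score, data = tmp.data)
  coefs[[basename(f)]] <- summary(fit)$coefficients
}
names(coefs)
```

Because the data frame is reloaded at the top of each iteration, there is nothing to clean up between the individual regressions.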

modified 4 months ago • written 4 months ago by ATpoint