Error in y - mu : non-numeric argument to binary operator - linear regression loop
3.2 years ago
camillab. ▴ 160

Hi!

I am trying to run a linear regression on age and sex, in a loop, on a dataset with 6 samples and 16757 genes. This is my dataset (I copied only the first columns):

'data.frame':   6 obs. of  16757 variables:
 $ samples               : chr  "hu-c_lab13" "hu-c_lab15" "hu-c_lab17" "hu-gent_lab14" ...
 $ treatment             : chr  "untreated" "untreated" "untreated" "treatment" ...
 $ sex                   : chr  "Male" "Female" "Male" "Female" ...
 $ age                   : num  45 56 46 65 21 75
 $ 7SK (i)               : num  87779 79828 64005 44973 42646 ...

I want to loop over the genes to identify whether age and sex affect gene expression, and I want to obtain the fitted.values:

prova$treatment <- factor(prova$treatment, levels=c("treatment","untreated"))
prova$sex <- factor(prova$sex, levels=c("Female","Male"))
prova$age <- as.numeric(prova$age)

genelist <- prova %>% select(5:16757) #select genes

for (i in 1:length(genelist)) {
  formula <- as.formula(paste("samples ~ ", genelist[i], " + age + sex ", sep=""))
  model <- glm(formula, data = prova)
  print(model[["fitted.values"]])
}

but it gives me

Error in y - mu : non-numeric argument to binary operator

What am I doing wrong in the loop?

Also, if I do it for a single gene, it works:

model2 <- lm(ENSG00000202198 ~ sex + age , data=prova)
summary(model2)
model$fitted.values <- predict(model2)
gene <- model2[["fitted.values"]]
gene  <- as.data.frame(gene)

Thank you

Camilla

bulkRNAseq R linear regression • 3.2k views
3.2 years ago
Sam ★ 4.7k

One of your genes might have non-numeric content. You can check that by printing i in your loop.
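
A quick way to run that check without the loop is to test every gene column at once. This is only a sketch on a toy data frame: the column names `geneA` and `geneB` and all values are made up for illustration, and the real `prova` would be used in its place.

```r
# Toy stand-in for `prova` (hypothetical values, not the poster's data)
prova <- data.frame(
  samples = c("s1", "s2", "s3", "s4", "s5", "s6"),
  sex     = c("Male", "Female", "Male", "Female", "Male", "Female"),
  age     = c(45, 56, 46, 65, 21, 75),
  geneA   = c(87779, 79828, 64005, 44973, 42646, 51000),
  geneB   = c("12", "15", "n/a", "9", "11", "14"),  # deliberately non-numeric
  stringsAsFactors = FALSE
)

# Everything that is not sample metadata is treated as a gene column
gene_cols <- setdiff(names(prova), c("samples", "sex", "age"))

# Gene columns whose class is not numeric
is_num <- vapply(prova[gene_cols], is.numeric, logical(1))
non_numeric <- gene_cols[!is_num]

# Numeric gene columns that contain NA, NaN, or +/-Inf
has_bad <- vapply(prova[gene_cols[is_num]],
                  function(x) any(!is.finite(x)), logical(1))
non_finite <- names(has_bad)[has_bad]

print(non_numeric)
print(non_finite)
```

If both results come back empty on the real data, the gene columns are probably fine and the response variable in the formula is the next thing to look at.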


Like this? (Apologies, I am really bad with loops...)

for (gene in 1:length(genelist)) {
  formula <- as.formula(paste("samples ~ ", genelist, " + age + sex ", sep=""))
  model <- glm(formula, data = prova)
  print(model[["fitted.values"]])
}

Also, I checked whether any of the gene columns I am interested in are character (sapply(prova[5:16753], class)), and none of them are. Could it have something to do with the fact that sex is a character?

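for (gene in names(genelist)) {
  print(gene)  # gene name; the last one printed is the one that failed
  # Backticks keep names such as "7SK (i)" valid inside the formula
  formula <- as.formula(paste0("samples ~ `", gene, "` + age + sex"))
  model <- glm(formula, data = prova)
  print(model[["fitted.values"]])
}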

Besides non-numeric values, you can also check for infinite values. This loop prints the name of each gene before fitting it, so the last name printed is the gene that triggered the error, which should help you debug the problem. (It is also possible that your samples column is non-numeric.)
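
If the samples column does turn out to be the culprit (it is chr in the str() output in the question), one possible fix is to put the gene on the left-hand side of the formula, as in the single-gene lm() call that already works. The following is a rough sketch under that assumption, run on a toy data frame with made-up values; `gene_names` and `fitted_mat` are names introduced here for illustration.

```r
# Toy stand-in for `prova` (hypothetical values, not the poster's data)
prova <- data.frame(
  samples = paste0("s", 1:6),
  sex     = factor(c("Male", "Female", "Male", "Female", "Male", "Female")),
  age     = c(45, 56, 46, 65, 21, 75),
  check.names = FALSE
)
prova[["7SK (i)"]] <- c(87779, 79828, 64005, 44973, 42646, 51000)
prova[["geneB"]]   <- c(10, 12, 11, 15, 9, 14)

# Loop over column *names*, not over the columns themselves
gene_names <- setdiff(names(prova), c("samples", "sex", "age"))

fitted_list <- list()
for (gene in gene_names) {
  # Gene is the response; backticks protect names like "7SK (i)"
  f <- as.formula(sprintf("`%s` ~ age + sex", gene))
  model <- lm(f, data = prova)
  fitted_list[[gene]] <- fitted(model)
}

# One column of fitted values per gene, one row per sample
fitted_mat <- do.call(cbind, fitted_list)
rownames(fitted_mat) <- prova$samples
print(fitted_mat)
```

Storing the fitted values in a list and binding them at the end avoids printing thousands of vectors to the console, which matters with a gene list of this size.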

Sam
