Question: standard deviation using R
rsabrina9310

How can I compute the standard deviation for each of many rows (27482), each containing many columns (702), using R? I tried the code below:



``````
for(i in 1:nrow(zscore)){
  print(i)
  SD=sd(as.numeric(zscore[i,]))
  zscore[i,702]=SD
}
``````
Devon Ryan

We can use the `apply` and `sd` functions; see the example:

``````
# dummy reproducible data
zscore <- data.frame(matrix(1:10, nrow = 2))
# introduce NA
zscore[1, 2] <- NA
zscore
#   X1 X2 X3 X4 X5
# 1  1 NA  5  7  9
# 2  2  4  6  8 10

# below will give NAs
apply(zscore, 1, sd)
#        NA 3.162278

# we need to remove NAs before getting SD
apply(zscore, 1, sd, na.rm = TRUE)
#  3.415650 3.162278
``````
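To keep the result alongside the data, one option is to compute the row-wise SDs once and append them as a new column, rather than writing into an existing column such as 702 as the loop in the question does. This is a minimal sketch on the same toy data; the column name `SD` is an arbitrary choice, not something from the original post.

```r
# Toy data standing in for the real 27482 x 702 table
zscore <- data.frame(matrix(1:10, nrow = 2))
zscore[1, 2] <- NA

# Compute the row-wise SDs from the original columns only
row_sd <- apply(zscore, 1, sd, na.rm = TRUE)

# Append the result as a new column instead of overwriting a data column
# ("SD" is a hypothetical name chosen for this example)
zscore$SD <- row_sd
zscore
```

Computing `row_sd` before adding the column ensures the SD column itself is never included in the calculation.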

Does it calculate for the entire data frame?

Yes. The `apply()` function will execute the `sd()` function on each row separately; the second parameter of `apply()` specifies whether the rows (=1) or the columns (=2) will be analyzed.
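A small illustration of the difference between the two margins, using a made-up 2 x 3 matrix (the values are chosen only so the results are easy to check by hand):

```r
# Hypothetical 2 x 3 matrix
m <- matrix(c(1, 2, 3,
              4, 6, 8), nrow = 2, byrow = TRUE)

# MARGIN = 1: one SD per row, so the result has length nrow(m)
row_sds <- apply(m, 1, sd)
row_sds
# 1 2

# MARGIN = 2: one SD per column, so the result has length ncol(m)
col_sds <- apply(m, 2, sd)
length(col_sds)
# 3
```

For a 27482 x 702 table, `apply(x, 1, sd)` should therefore return 27482 values, one per row.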

I get NAs as the result for all the rows.

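Perhaps you need to convert the columns to numeric first (`as.numeric()` works on a vector rather than on a whole data frame, so something like `zscore[] <- lapply(zscore, as.numeric)`) or use the `na.rm=TRUE` option.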

I am assuming Devon's example works fine for you and the problem arises when you apply it to your own data. Maybe your data are not all numeric, e.g. if you have a data frame with other, non-numeric variables (and so Devon's suggestion about using `as.numeric()` may or may not work). You could post a few lines of your dataset so that we can take a look at it.
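In the meantime, one way to check for this yourself is to inspect the class of each column and restrict the calculation to the numeric ones. The sketch below uses an invented data frame with a `gene` identifier column; your real column names and types will differ.

```r
# Hypothetical data: an identifier column plus three numeric columns
df <- data.frame(
  gene = c("g1", "g2"),
  a = c(1, 2),
  b = c(3, 6),
  c = c(5, 10),
  stringsAsFactors = FALSE
)

# Show the class of every column to spot non-numeric ones
sapply(df, class)

# Keep only the numeric columns before computing row-wise SDs
is_num <- sapply(df, is.numeric)
num_only <- df[, is_num, drop = FALSE]

row_sd <- apply(num_only, 1, sd, na.rm = TRUE)
row_sd
```

If `sapply(df, class)` reports `character` or `factor` for columns you expected to be numeric, that usually points to a problem at import time (for example, a stray text value in the column) and is worth fixing there.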