Standard deviation using R
7.3 years ago
rsabrina93 ▴ 10

How can I compute the standard deviation of each row in a table with many rows (27482) and many columns (702) using R? I tried the code below:

code:

for(i in 1:nrow(zscore)){
    print(i)
    SD=sd(as.numeric(zscore[i,]))
    zscore[i,702]=SD
}
7.3 years ago

We can use the apply() and sd() functions; see this example:

# dummy reproducible data
zscore <- data.frame(matrix(1:10, nrow = 2))
# introduce NA
zscore[1, 2] <- NA
zscore
#   X1 X2 X3 X4 X5
# 1  1 NA  5  7  9
# 2  2  4  6  8 10

# below will give NAs
apply(zscore, 1, sd)
# [1]       NA 3.162278

# we need to remove NAs before getting SD
apply(zscore, 1, sd, na.rm = TRUE)
# [1] 3.415650 3.162278
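
For a table like the one in the question, it may be safer to store the result in a new column rather than writing into column 702, which would overwrite data. This is only a minimal sketch, assuming every column is numeric; the column name row_sd is made up for illustration:

```r
# dummy data standing in for the real zscore table
zscore <- data.frame(matrix(1:10, nrow = 2))
zscore[1, 2] <- NA

# remember the data columns before adding anything
data_cols <- names(zscore)

# row-wise SD over the data columns only, stored in a new column
zscore$row_sd <- apply(zscore[, data_cols], 1, sd, na.rm = TRUE)
zscore
```

Selecting data_cols explicitly means that re-running the line does not fold the previously computed SD back into the calculation.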

Does it calculate for the entire data frame?


Yes. The apply() function runs sd() on each row separately, because the second parameter of apply() specifies whether the rows (= 1) or the columns (= 2) are analyzed.
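
To make the difference between the two settings concrete, here is a small sketch on a toy matrix (the name m is just for illustration):

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

# second argument 1: one SD per row
row_sd <- apply(m, 1, sd)
row_sd
# [1] 2 2

# second argument 2: one SD per column
col_sd <- apply(m, 2, sd)
col_sd
# [1] 0.7071068 0.7071068 0.7071068
```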


I get NAs as result for all the rows


Perhaps you need to run zscore = as.numeric(zscore) first, or use the na.rm = TRUE option.
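
One caveat: as far as I know, calling as.numeric() directly on a data frame with more than one row stops with an error, so the conversion usually has to be done column by column. A minimal sketch, assuming the values were read in as text; the toy data below is hypothetical:

```r
# hypothetical example: numbers that were read in as character strings
zscore <- data.frame(a = c("1", "2"), b = c("3", "4"), c = c("5", "6"),
                     stringsAsFactors = FALSE)

# convert each column to numeric while keeping the data frame shape
zscore[] <- lapply(zscore, as.numeric)

row_sd <- apply(zscore, 1, sd, na.rm = TRUE)
row_sd
# [1] 2 2
```

If the columns are factors rather than character strings, as.numeric(as.character(x)) would be needed instead, since as.numeric() on a factor returns the internal level codes.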


I am assuming Devon's example works fine for you and the problem appears when you apply it to your own data. Maybe your data are not all numeric, e.g. if the data frame contains other variables (and so Devon's suggestion about using as.numeric() may or may not work). You could post a few lines of your dataset so that we can take a look at it.
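
A quick way to check this yourself is sketched below. The data and the column names gene, s1, s2, s3 are hypothetical; the real table may look different:

```r
# hypothetical data: an identifier column next to numeric columns
zscore <- data.frame(gene = c("g1", "g2"),
                     s1 = c(1, 2), s2 = c(3, 4), s3 = c(5, 6))

# which columns are numeric?
is_num <- sapply(zscore, is.numeric)
non_numeric <- names(zscore)[!is_num]
non_numeric
# [1] "gene"

# row-wise SD over the numeric columns only
row_sd <- apply(zscore[, is_num], 1, sd, na.rm = TRUE)
row_sd
# [1] 2 2
```

Running str(zscore) on the real data gives the same information at a glance.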
