Strategies to offset the effects of row-wise correction from missing data
4.0 years ago
strkiky2 • 0

Here's an example dataset:

data <- t(data.frame(met1 = c(2,2,2,2,2),
                   met2 = c(5,4,NA,2,1),
                   met3 = c(2,2,2,NA,2),
                   met4 = c(2,4,6,8,6),
                   met5 = c(1,3,4,7,2)))

This gives:

     [,1] [,2] [,3] [,4] [,5]
met1    2    2    2    2    2
met2    5    4   NA    2    1
met3    2    2    2   NA    2
met4    2    4    6    8    6
met5    1    3    4    7    2

I often apply a row-wise correction to my dataset: each value is divided by the sum of its row, so that all values lie between 0 and 1.

data <- data / rowSums(data, na.rm = TRUE)

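This works well when there is no missing data. With missing values, however, the row sum is taken over fewer entries, so the remaining values are inflated. As the output below shows, met1 and met3 have identical observed raw values (all 2), yet each corrected value of met3 is 0.25 while each value of met1 is 0.20.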

           [,1]      [,2]      [,3]      [,4]       [,5]
met1 0.20000000 0.2000000 0.2000000 0.2000000 0.20000000
met2 0.41666667 0.3333333        NA 0.1666667 0.08333333
met3 0.25000000 0.2500000 0.2500000        NA 0.25000000
met4 0.07692308 0.1538462 0.2307692 0.3076923 0.23076923
met5 0.05882353 0.1764706 0.2352941 0.4117647 0.11764706
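
To put a number on the effect, here is a minimal sketch. It assumes (and this is only an assumption, the data cannot confirm it) that each missing entry would have been close to the mean of the observed entries in its row. Under that assumption the row sum shrinks in proportion to the number of observed values, so the corrected values are inflated by a factor of `ncol / n_observed`. The names `raw`, `n_obs` and `inflation` are made up for illustration.

```r
# Rebuild the raw (uncorrected) matrix from the example above
raw <- t(data.frame(met1 = c(2,2,2,2,2),
                    met2 = c(5,4,NA,2,1),
                    met3 = c(2,2,2,NA,2),
                    met4 = c(2,4,6,8,6),
                    met5 = c(1,3,4,7,2)))

# Number of non-missing values in each row
n_obs <- rowSums(!is.na(raw))

# Approximate inflation of the corrected values in each row,
# ASSUMING the missing entries resemble the observed row mean
inflation <- ncol(raw) / n_obs
inflation
```

For met3 this gives 5 / 4 = 1.25, which matches the ratio 0.25 / 0.20 seen in the output above.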

How could I offset this effect? Currently I remove every column that contains missing data, but I would prefer not to, as important data could be lost that way.
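
One option I'm considering, shown here only as a rough sketch and not as a validated method: divide by the row mean of the observed values multiplied by the number of columns, instead of by the raw row sum. For complete rows this is identical to dividing by `rowSums()`. For rows with gaps it implicitly treats each missing entry as if it equalled the observed row mean, which is an assumption that may not hold for real data. The names `raw`, `denom` and `scaled` are made up for illustration.

```r
# Raw (uncorrected) matrix from the example above
raw <- t(data.frame(met1 = c(2,2,2,2,2),
                    met2 = c(5,4,NA,2,1),
                    met3 = c(2,2,2,NA,2),
                    met4 = c(2,4,6,8,6),
                    met5 = c(1,3,4,7,2)))

# Estimated "full" row sum: mean of the observed values times the
# total number of columns (equals rowSums() when nothing is missing)
denom <- rowMeans(raw, na.rm = TRUE) * ncol(raw)

# Dividing a matrix by a vector of length nrow recycles down the
# columns, so each row i is divided by denom[i]
scaled <- raw / denom

scaled[c("met1", "met3"), ]
```

With this scaling the observed values of met3 come out at 0.2, the same as met1. The trade-off is that rows with missing entries no longer sum to 1 over their observed values (met3 sums to 0.8), and the missing cells themselves stay NA.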

R • 525 views