log2 normalization produces NaNs
2.9 years ago

I want to apply global median normalization to 4 arrays, using the Cy5 foreground values with the Cy5 background subtracted.

for(i in 1:4) {
  name <- paste("sample", i, sep = ".")
  bg <- maRb(dat[,i])
  fg <- maRf(dat[,i])
  diff <- fg - bg
  assign(name, log2(diff))
}

data.prenorm <- cbind(sample.1, sample.2, sample.3, sample.4)
data.median  <- apply(data.prenorm, 2, median, na.rm = T)
data.norm <- sweep(data.prenorm, 2, data.median)

colnames(data.norm) <- c("Array 1", "Array 2", "Array 3", "Array 4")

median(data.norm[ , 1], na.rm = T)
median(data.norm[ , 2], na.rm = T)
median(data.norm[ , 3], na.rm = T)
median(data.norm[ , 4], na.rm = T)

My code produces a warning message in R:

In assign(name, log2(diff)) : NaNs produced

I agree with what Mensur Dlakic said. I would also encourage you to use a dedicated package for normalizing the data. There are most likely several available: arrays are not a new technology, and extensive analysis methodology has already been developed.

Mensur Dlakic ★ 27k

Logarithm of zero is undefined. Try adding 1 to all the values before applying the log function.
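
A quick check of how `log2()` behaves at the boundary may help here. In base R, `log2(0)` returns `-Inf` without a warning, while a negative argument returns `NaN` and triggers the "NaNs produced" warning, so the warning in the question most likely comes from spots where the background exceeds the foreground. The values below are made up for illustration and are not taken from the question's data.

```r
# Hypothetical background-subtracted intensities (illustration only)
diff <- c(8, 1, 0, -3)

# suppressWarnings() hides the "NaNs produced" warning for this demo
logged <- suppressWarnings(log2(diff))

print(logged)          # 3, 0, -Inf, NaN
print(is.nan(logged))  # only the last element is NaN

# Adding 1 avoids -Inf at zero but does not help once diff <= -1
shifted <- suppressWarnings(log2(diff + 1))
print(shifted)         # log2(9), 1, 0, NaN
```

So a +1 offset handles zeros, but negative differences still need separate treatment, for example masking them as `NA`.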


Thank you for your suggestions.

First, rather than adding 1 to all the values, I decided to set all the negative values to NA.

for(i in 1:4){
  name <- paste("sample", i, sep = ".")
  bg <- maRb(dat[,i])
  fg <- maRf(dat[,i])
  diff <- fg - bg
  diff[diff < 0] <- NA
  assign(name, log2(diff))
} 
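
The same step can be sketched without `assign()`, which avoids creating one variable per array. This is only a minimal illustration: the plain matrices `fg` and `bg` are made-up stand-ins for the red foreground and background intensities (an assumption, not the question's data), and zeros are masked as well because `log2(0)` is `-Inf`.

```r
# Made-up intensities: 5 spots (rows) x 4 arrays (columns)
fg <- matrix(c(120, 80, 50, 300, 40,
               200, 90, 60, 150, 30,
               110, 70, 55, 280, 45,
               130, 85, 40, 310, 35), nrow = 5)
bg <- matrix(50, nrow = 5, ncol = 4)

diff <- fg - bg
diff[diff <= 0] <- NA          # mask non-positive values before taking logs
data.prenorm <- log2(diff)     # one column per array, no assign() needed
colnames(data.prenorm) <- paste("sample", 1:4, sep = ".")

print(data.prenorm)
```

Working on the whole matrix at once also removes the need for the later `cbind()` call.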

However, after that, I have the following issue. I want to calculate the global median normalization on these 4 arrays using the log2(diff) values, such that all the arrays will have a median of 1 after normalization.

data.prenorm <- cbind(sample.1, sample.2, sample.3, sample.4)
data.median  <- apply(data.prenorm, 2, median, na.rm = T)
data.norm    <- sweep(data.prenorm, 2, data.median)

colnames(data.norm) <- c("Array 1", "Array 2", "Array 3", "Array 4")

median(data.norm[ , 1], na.rm = T) 
median(data.norm[ , 2], na.rm = T)
median(data.norm[ , 3], na.rm = T)
median(data.norm[ , 4], na.rm = T)

However, all the medians evaluate to 0 instead of 1. Why?
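
One likely explanation: `sweep()` uses `FUN = "-"` by default, so the call subtracts each column's median from that column, and a column minus its own median necessarily has a median of 0. On the log2 scale that is the usual target, because a log2 value of 0 corresponds to a value of 1 on the original scale. The sketch below uses made-up numbers to show this, and how a median of exactly 1 on the log2 scale could be obtained by adding 1 afterwards, if that is really what is wanted.

```r
# Made-up log2 intensities with one missing value (illustration only)
data.prenorm <- cbind(sample.1 = c(5, 6, 7, NA),
                      sample.2 = c(8, 9, 10, 11))

data.median <- apply(data.prenorm, 2, median, na.rm = TRUE)

# sweep() subtracts by default (FUN = "-"), centring each column on 0
data.norm <- sweep(data.prenorm, 2, data.median)
print(apply(data.norm, 2, median, na.rm = TRUE))     # both medians are 0

# Back on the original scale, the centred medians correspond to 1
print(2^apply(data.norm, 2, median, na.rm = TRUE))   # both are 1

# Shifting by 1 gives a median of 1 on the log2 scale instead
data.norm1 <- data.norm + 1
print(apply(data.norm1, 2, median, na.rm = TRUE))    # both medians are 1
```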
