Question: Remove inf and NA from data frame
krushnach80 (570) wrote:

I have a data frame that contains log2 fold-change values, but it also contains inf and NA values. I searched all over Stack Exchange and tried the solutions there. Most of them do not seem to work, and the few that do are not giving me the desired output. I need help; it shouldn't be that difficult, I suppose.

My data frame:

GENE_NAME  HSC_VS_CMP   HSC_VS_GMP  HSC_VS_Monocytes
ACTL6A     -0.20084399   0.297430   -0.350876000
ACTR8      -0.2925280   -0.158551    1.10747
AICDA       inf          NA          inf
ANP32B     -0.6549      -0.615725   -0.35858

I get an error like this: "default method not implemented for type 'list'".

So please suggest how I can remove all the inf and NA values from the data frame.

modified 2.6 years ago by ddiez (1.8k) • written 2.6 years ago by krushnach80 (570)

What have you tried (with code)? Presuming you just want to remove the rows, it's a simple matter of (A) apply(d, 1, function(x) any(is.na(x) | is.infinite(x))) and then (B) subsetting accordingly.
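
A minimal sketch of steps (A) and (B), assuming a small data frame `d` modeled on the question's table (the values and the name `bad_row` are made up for illustration). The test is restricted to the numeric columns, because apply() would otherwise coerce every row to character on account of GENE_NAME:

```r
# Hypothetical data modeled on the question's table
d <- data.frame(
  GENE_NAME  = c("ACTL6A", "ACTR8", "AICDA", "ANP32B"),
  HSC_VS_CMP = c(-0.2008, -0.2925, Inf, -0.6549),
  HSC_VS_GMP = c(0.2974, -0.1586, NA, -0.6157),
  stringsAsFactors = FALSE
)

# (A) flag rows with any NA or infinite value in the numeric columns
num_cols <- sapply(d, is.numeric)
bad_row <- apply(d[, num_cols, drop = FALSE], 1,
                 function(x) any(is.na(x) | is.infinite(x)))

# (B) keep only the rows that were not flagged
d_clean <- d[!bad_row, ]
```

Here the AICDA row is the only one flagged, so `d_clean` keeps the other three genes.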

written 2.6 years ago by Devon Ryan (91k)

Alternative without using apply:

x[rowSums(is.na(x) | is.infinite(x)) == 0, ]

Note: either way, you may need to take special care of the GENE_NAME column. I suspect you are having trouble with that in your attempts so far (but as mentioned by Devon, please show us the code).
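
One way to take care of the GENE_NAME column, sketched under the assumption that the value columns are numeric: run the row test on a numeric matrix built from those columns only. The data and the names `m` and `x_clean` are hypothetical.

```r
# Hypothetical data modeled on the question's table
x <- data.frame(
  GENE_NAME  = c("ACTL6A", "AICDA", "ANP32B"),
  HSC_VS_CMP = c(-0.2008, Inf, -0.6549),
  HSC_VS_GMP = c(0.2974, NA, -0.6157),
  stringsAsFactors = FALSE
)

# Numeric columns only, as a matrix, so the test never touches GENE_NAME
m <- as.matrix(x[, sapply(x, is.numeric), drop = FALSE])

# !is.finite() is TRUE for NA, NaN, Inf and -Inf alike
x_clean <- x[rowSums(!is.finite(m)) == 0, ]
```

The full rows, GENE_NAME included, are kept; only the test ignores the text column.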

written 2.6 years ago by ddiez (1.8k)

You can't is.infinite() dataframes, which is probably what resulted in the originally reported error.
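
A small sketch of the difference, assuming a throwaway data frame `df` (hypothetical). The failing call is wrapped in tryCatch() so the script keeps running and only the error message is captured:

```r
df <- data.frame(a = c(1, Inf), b = c(NA, 2))

# is.na() has a data frame method and returns a logical matrix
na_mask <- is.na(df)

# is.infinite() has none, so the call fails; capture the message instead of stopping
msg <- tryCatch(
  {
    is.infinite(df)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```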

written 2.6 years ago by Devon Ryan (91k)

You are right! (I didn't do enough testing...) I find this a bit inconsistent, since is.na() works fine. Further searching points to this SO post, where a viable solution is to implement an is.finite.data.frame method, e.g.:

is.finite.data.frame <- function(obj){
    sapply(obj,FUN = function(x) all(is.finite(x)))
}
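
The method above returns one TRUE/FALSE per column. For changing individual cells, an elementwise mask can be more convenient. A minimal sketch, assuming a hypothetical helper named `inf_mask` (it is not part of base R) and made-up data:

```r
# Hypothetical helper: elementwise is.infinite() for a data frame,
# returned as a logical matrix with one column per data frame column
inf_mask <- function(df) {
  do.call(cbind, lapply(df, is.infinite))
}

df <- data.frame(a = c(1, Inf, 3), b = c(-Inf, 2, NA))

mask <- inf_mask(df)
df[mask] <- NA   # turn every Inf / -Inf cell into NA
```

After this step only NA remains to be dealt with, which is.na() already handles on a data frame.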
written 2.6 years ago by ddiez (1.8k)

Which way should I do it? The way you suggested? I know there are multiple ways, but since I'm learning and at the same time have to use them on my data sets, I just have to read and see which one works.

So can you show me how to remove NA, inf and 0, if any, from my data frame in concise code?

written 2.6 years ago by krushnach80 (570)

I can take out the GENE_NAME column and apply the same

written 2.6 years ago by krushnach80 (570)

I tried this:

na.zero <- function(x) {
  x[is.na(x)] <- 0
  return(x)
}
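
This function only catches NA. A minimal sketch of a variant that also zeroes Inf and -Inf, applied column by column to the numeric columns; the name `nonfinite_zero` and the sample data are assumptions for illustration:

```r
# Hypothetical variant of na.zero(): replace NA, NaN, Inf and -Inf with 0
nonfinite_zero <- function(x) {
  x[!is.finite(x)] <- 0
  x
}

df <- data.frame(
  GENE_NAME  = c("ACTR8", "AICDA"),
  HSC_VS_CMP = c(-0.2925, Inf),
  HSC_VS_GMP = c(-0.1586, NA),
  stringsAsFactors = FALSE
)

# Apply to the numeric columns only, leaving GENE_NAME untouched
num_cols <- sapply(df, is.numeric)
df[num_cols] <- lapply(df[num_cols], nonfinite_zero)
```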

modified 2.6 years ago • written 2.6 years ago by krushnach80 (570)

What is the desired output?

written 2.6 years ago by zx8754 (8.0k)

Well, I want to replace inf and NA with 0.

written 2.6 years ago by krushnach80 (570)

ddiez (1.8k), Japan, wrote:

Alternative (borrowed from this SO answer):

# test data
d <- mtcars

# add NAs and Inf.
d[1,1] <- NA
d[2,2] <- NA
d[2,1] <- Inf
d[1,2] <- Inf

head(d)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           NA Inf  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag      Inf  NA  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

# the magic:
d <- do.call(data.frame, lapply(d, function(x) {
  replace(x, is.infinite(x) | is.na(x), 0)
  })
)

# note that you lose the row names.
head(d)
   mpg cyl disp  hp drat    wt  qsec vs am gear carb
1  0.0   0  160 110 3.90 2.620 16.46  0  1    4    4
2  0.0   0  160 110 3.90 2.875 17.02  0  1    4    4
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
written 2.6 years ago by ddiez (1.8k)

How do I keep the row names? Or I can simply make a subset of the original data set and cbind.data.frame().

written 2.6 years ago by krushnach80 (570)

That is inconvenient, isn't it? I would just assign the modified data.frame to a different variable (in the do.call part) and then copy the row names from the original data to the modified one.
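
A minimal sketch of that idea, assuming the result is stored as `d2` (the name is arbitrary) and using a few rows of mtcars as stand-in data. The last two lines show an alternative that sidesteps the issue by assigning into `d3[]`, which keeps the data frame's attributes, row names included:

```r
# Stand-in data with one NA and one Inf
d <- head(mtcars)
d[1, 1] <- NA
d[2, 1] <- Inf

# Store the result under a new name, then copy the row names across
d2 <- do.call(data.frame, lapply(d, function(x) {
  replace(x, is.infinite(x) | is.na(x), 0)
}))
rownames(d2) <- rownames(d)

# Alternative: replace the columns in place, so the row names are never dropped
d3 <- d
d3[] <- lapply(d3, function(x) replace(x, is.infinite(x) | is.na(x), 0))
```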

written 2.6 years ago by ddiez (1.8k)

Okay, that's a new concept to me. I will try it and let you know if I am able to do it.

written 2.6 years ago by krushnach80 (570)

You have half the answer in Devon's answer to your question. He assigns the result from lapply() to d2 instead of d. Also take a look at ?rownames.

written 2.6 years ago by ddiez (1.8k)

To create a pipeable row-name setter, you could replicate the setNames function:

setRownames <- function(x, rn) {
  row.names(x) <- rn
  x
}

Then do d <- do.call(blah blah blah) %>% setRownames(rownames(d))

written 2.6 years ago by russhh (4.6k)

Devon Ryan (91k), Freiburg, Germany, wrote:

is.na() works on dataframes. For the Infs, lapply() a function:

d2 = lapply(d, function(x) {
    if(any(is.infinite(x))) {
         x[is.infinite(x)] = 0
    }
    return(x)
})
d2 = as.data.frame(d2)
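
A minimal sketch that puts both halves together, assuming a small made-up data frame `d`: the NA values are replaced through is.na() on the whole data frame, and the Inf values column by column as above.

```r
# Hypothetical data with NA, Inf and -Inf
d <- data.frame(a = c(1, NA, Inf), b = c(-Inf, 2, 3))

# NA: is.na() accepts the whole data frame
d[is.na(d)] <- 0

# Inf: handled column by column
d2 <- as.data.frame(lapply(d, function(x) {
  x[is.infinite(x)] <- 0
  x
}))
```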
modified 2.6 years ago • written 2.6 years ago by Devon Ryan (91k)

I still see inf in my data frame

written 2.6 years ago by krushnach80 (570)

When I do it I don't see Inf, so show a reproducible example.

written 2.6 years ago by Devon Ryan (91k)

This works for me, no problem (for example, using the dataset in my answer). Note, in the example data you included infinite is specified as "inf" not Inf (R's way). So maybe it is related?

written 2.6 years ago by ddiez (1.8k)

OK, even if your original data had "inf" instead of "Inf", it seems read.table (and probably friends) ignores case (see ?read.table). So it is read properly and shouldn't be a problem (e.g. read.table(text = "inf 0 10")).

written 2.6 years ago by ddiez (1.8k)

Some guesswork: I would also check for "inf" and "NA" stored as character strings, assign NA to them, then use complete.cases().
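
A minimal sketch of that check, assuming the value columns were read in as text; the data and the names `value_cols` and `df_clean` are hypothetical:

```r
# Hypothetical data where the numbers arrived as character strings
df <- data.frame(
  GENE_NAME  = c("ACTL6A", "AICDA", "ANP32B"),
  HSC_VS_CMP = c("-0.2008", "inf", "-0.6549"),
  HSC_VS_GMP = c("0.2974", "NA", "-0.6157"),
  stringsAsFactors = FALSE
)

value_cols <- c("HSC_VS_CMP", "HSC_VS_GMP")
df[value_cols] <- lapply(df[value_cols], function(x) {
  # text markers become real NA before the conversion to numeric
  x[tolower(x) %in% c("inf", "-inf", "na")] <- NA
  as.numeric(x)
})

# Keep only the rows without any NA
df_clean <- df[complete.cases(df), ]
```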

modified 2.6 years ago • written 2.6 years ago by zx8754 (8.0k)