Question

Making a data table more comprehensive

1

7.4 years ago

Parham ★ 1.6k

Hi, I want to convert a data frame that I have into a more comprehensive presence/absence table. In the abstract, this is what I intend to do:

> df
  cond_1 cond_2 cond_3 cond_4
1      a      a      b      e
2      b      b      c      f
3   <NA>      c      d      g

> df_t
  cond_1 cond_2 cond_3 cond_4
a      +      +      -      -
b      +      +      +      -
c      -      +      +      -
d      -      -      +      -
e      -      -      -      +
f      -      -      -      +
g      -      -      -      +

Is there any package or function that can do the job or facilitate coding? Any suggestion is a big help!

R data.frame transform • 1.5k views

1

7.4 years ago

Steven Lakin ★ 1.8k

df <- data.frame(cond_1=c('a', 'b', NA),
                 cond_2=c('a', 'b', 'c'),
                 cond_3=c('b', 'c', 'd'),
                 cond_4=c('e', 'f', 'g'))

# For every unique value, test whether it occurs in each column (one row per value)
df2 <- t(sapply(as.character(unique(unlist(df))), function(y) {
             lapply(df, function(x) {y %in% x})
        }))

# Drop the row that corresponds to NA
df_t <- df2[!is.na(rownames(df2)), ]

# You can replace the logicals like so
df_t[df_t == TRUE] <- '+'
df_t[df_t == FALSE] <- '-'

> df_t
  cond_1 cond_2 cond_3 cond_4
a "+"    "+"    "-"    "-"   
b "+"    "+"    "+"    "-"   
c "-"    "+"    "+"    "-"   
d "-"    "-"    "+"    "-"   
e "-"    "-"    "-"    "+"   
f "-"    "-"    "-"    "+"   
g "-"    "-"    "-"    "+"

1

7.4 years ago

mikhail.shugay 3.5k

Here you go:

df <- data.frame(cond_1=c('a', 'b', NA),
                 cond_2=c('a', 'b', 'c'),
                 cond_3=c('b', 'c', 'd'),
                 cond_4=c('e', 'f', 'g'))

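library(reshape2)

# Melt to long format, then cast to one row per value and one column per condition.
# Cells with no match are NA; the row for the NA value itself is dropped.
df <- subset(dcast(melt(df, id.vars = c()), value ~ variable), !is.na(value))
rownames(df) <- df$value
df$value <- NULL

# NA means the value is absent from that condition
ifelse(is.na(df), "-", "+")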

score 2 · Accepted Answer · 2016-11-23

2

7.4 years ago

Santosh Anand 5.7k

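# Unique non-NA values become the row names
vals <- as.character(unique(unlist(df)))
vals <- vals[!is.na(vals)]
# For each column, mark which of those values are present
df2 <- sapply(df, function(x) ifelse(vals %in% x, "+", "-"))
df3 <- as.data.frame(df2, stringsAsFactors = FALSE, row.names = vals)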

> df3
  cond_1 cond_2 cond_3 cond_4
a      +      +      -      -
b      +      +      +      -
c      -      +      +      -
d      -      -      +      -
e      -      -      -      +
f      -      -      -      +
g      -      -      -      +
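
For comparison, the same table can probably also be built in base R with table(). This is only a sketch and not taken from any of the answers above: the names long, counts and df_t2 are made up for illustration, and it assumes the columns are read as character (hence stringsAsFactors = FALSE).

```r
# Example data from the question; character columns are an assumption of this sketch
df <- data.frame(cond_1 = c('a', 'b', NA),
                 cond_2 = c('a', 'b', 'c'),
                 cond_3 = c('b', 'c', 'd'),
                 cond_4 = c('e', 'f', 'g'),
                 stringsAsFactors = FALSE)

# Long format: one row per cell, keeping track of the column each value came from
# (unlist() walks the data frame column by column, so rep(..., each = nrow(df)) lines up)
long <- data.frame(value = unlist(df, use.names = FALSE),
                   cond  = rep(names(df), each = nrow(df)),
                   stringsAsFactors = FALSE)

# Cross-tabulate value by condition; table() ignores NA by default
counts <- unclass(table(long$value, long$cond))

# Any count above zero means the value is present in that condition
df_t2 <- as.data.frame(ifelse(counts > 0, "+", "-"), stringsAsFactors = FALSE)
```

Printing df_t2 should give the same layout as df_t in the question, with rows sorted alphabetically by value rather than in order of first appearance.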