Making a data table more comprehensive
7.4 years ago
Parham ★ 1.6k

Hi, I want to convert a data table that I have into something more comprehensive. In the abstract, this is what I intend to do:

> df
  cond_1 cond_2 cond_3 cond_4
1      a      a      b      e
2      b      b      c      f
3   <NA>      c      d      g

> df_t
  cond_1 cond_2 cond_3 cond_4
a      +      +      -      -
b      +      +      +      -
c      -      +      +      -
d      -      -      +      -
e      -      -      -      +
f      -      -      -      +
g      -      -      -      +

Is there any package or function that can do the job or facilitate coding? Any suggestion is a big help!

R data.frame transform • 1.5k views
7.4 years ago
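# Collect every distinct non-NA value across all columns; these become the row names
items <- unique(as.character(unlist(df)))
items <- items[!is.na(items)]
# For each column, mark whether each value occurs in it
df2 <- sapply(df, function(x) ifelse(items %in% x, "+", "-"))
df3 <- as.data.frame(df2, stringsAsFactors = FALSE, row.names = items)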

> df3
  cond_1 cond_2 cond_3 cond_4
a      +      +      -      -
b      +      +      +      -
c      -      +      +      -
d      -      -      +      -
e      -      -      -      +
f      -      -      -      +
g      -      -      -      +
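
For comparison, the same presence/absence table can also be sketched with base R's table(). This is only an illustration under stated assumptions: the input is rebuilt from the question with stringsAsFactors = FALSE, and the names values, conds, counts, and presence are made up for this example. Note that table() sorts row and column labels, so the order matches the desired output here only because the values and condition names already sort that way.

```r
# Example input from the question (plain character columns assumed)
df <- data.frame(cond_1 = c("a", "b", NA),
                 cond_2 = c("a", "b", "c"),
                 cond_3 = c("b", "c", "d"),
                 cond_4 = c("e", "f", "g"),
                 stringsAsFactors = FALSE)

# Long format: one (value, condition) pair per cell, taken column by column
values <- unlist(df, use.names = FALSE)
conds  <- rep(names(df), each = nrow(df))

# Cross-tabulate values against conditions; table() drops NA by default
counts <- table(values, conds)

# A count above zero means the value occurs in that condition
presence <- ifelse(unclass(counts) > 0, "+", "-")
print(presence, quote = FALSE)
```

The result is a character matrix; wrap it in as.data.frame() if a data frame is needed.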

Thank you all for your great and instructive solutions!

7.4 years ago
Steven Lakin ★ 1.8k
df <- data.frame(cond_1=c('a', 'b', NA),
                 cond_2=c('a', 'b', 'c'),
                 cond_3=c('b', 'c', 'd'),
                 cond_4=c('e', 'f', 'g'))

# For every distinct value, test which columns contain it (one row per value)
df2 <- t(sapply(as.character(unique(unlist(df))), function(y) {
    sapply(df, function(x) y %in% x)
}))

# Drop the row produced by the NA entry
df_t <- df2[!is.na(rownames(df2)), ]

# You can replace logicals like so

df_t[df_t == TRUE] <- '+'
df_t[df_t == FALSE] <- '-'

> df_t
  cond_1 cond_2 cond_3 cond_4
a "+"    "+"    "-"    "-"   
b "+"    "+"    "+"    "-"   
c "-"    "+"    "+"    "-"   
d "-"    "-"    "+"    "-"   
e "-"    "-"    "-"    "+"   
f "-"    "-"    "-"    "+"   
g "-"    "-"    "-"    "+"
7.4 years ago

Here you go:

df <- data.frame(cond_1=c('a', 'b', NA),
                 cond_2=c('a', 'b', 'c'),
                 cond_3=c('b', 'c', 'd'),
                 cond_4=c('e', 'f', 'g'))

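library(reshape2)
# Melt to long format (no id columns), cast back to one row per value, and drop the NA row
df <- subset(dcast(melt(df, id.vars = c()), value ~ variable), !is.na(value))
# Use the values as row names, then remove the helper column
rownames(df) <- df$value
df$value <- NULL

# Cells left NA by dcast mean the value is absent from that condition
ifelse(is.na(df), "-", "+")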
