How to convert data in R?
9 weeks ago
star ▴ 350

I have input like the one below and would like to know how to convert it into the output format shown.

Input:

data <- c(
"A \t 2212 \t 2267 \t 2299",
" \t 2015 \t 2068 \t 2067",
"B \t 5210 \t 5289 \t 5293",
" \t 4583 \t 4604 \t 4205",
"C \t 4720 \t 4510 \t 4558"
)%>% data.frame()


Output:

Group   C1    C2    C3
A       2212  2267  2299
A       2015  2068  2067
B       5210  5289  5293
B       4583  4604  4205
C       4720  4510  4558

R offtopic • 1.1k views

This post does not fit the theme of this forum. It's a simple R question. Please add more biological context or the question will be deleted.

9 weeks ago
bk11 ★ 1.8k

data <- c(
"A \t 2212 \t 2267 \t 2299",
" \t 2015 \t 2068 \t 2067",
"B \t 5210 \t 5289 \t 5293",
" \t 4583 \t 4604 \t 4205",
"C \t 4720 \t 4510 \t 4558"
)%>% data.frame()

lst <- as.data.frame(t(apply(data, 1, function(x) unname(unlist(strsplit(x, "\t"))))))
colnames(lst)=c("Group", "C1", "C2", "C3")
lst$Group=c("A","A","B","B","C")
lst
  Group   C1   C2   C3
1     A 2212 2267 2299
2     A 2015 2068 2067
3     B 5210 5289 5293
4     B 4583 4604 4205
5     C 4720 4510 4558

Great answer bk11. star, you'll need the dplyr package to run this code. Just make the following the first line of code: library(dplyr)

I don't see the need for the dplyr library in this code. c(), data.frame(), as.data.frame(), t(), apply(), unname(), unlist(), strsplit(), and colnames() are all base R functions.

You're missing the %>% before the data.frame(). Technically though, magrittr would suffice. In the process of removing unnecessary stuff, changing c(...) %>% data.frame() to data.frame(c(...)) will eliminate the need for dplyr/magrittr.

Ah, thanks for pointing that out. %>% is made available with magrittr; are pipes in the dplyr namespace too? I wasn't aware of that. Anyway, as of version 4.1, users can use base R to pipe as well: |>

While %>% is indeed exported by magrittr, dplyr is the most popular package to use it. dplyr imports magrittr, it does not define the %>% again. Like I said, while the code technically only needs magrittr to work, in all probability, importing dplyr will open the road for further downstream analysis while importing just magrittr won't do that. In other words, it is sufficient to import magrittr but it is more efficient to import dplyr.

9 weeks ago
zx8754 11k

Read as a tab delimited file(text), then fill the nas:

# read as text connection
x <- read.table(textConnection(data[[1]]), sep = "\t", na.strings = " ", col.names = c("Group", "C1", "C2", "C3"))

#fill NAs with last known value
x$Group <- zoo::na.locf(x$Group)

x
#  Group   C1   C2   C3
# 1    A  2212 2267 2299
# 2    A  2015 2068 2067
# 3    B  5210 5289 5293
# 4    B  4583 4604 4205
# 5    C  4720 4510 4558
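
If installing zoo is not an option, the same "carry the last label forward" step can be sketched in base R. This is only an illustration, not part of the answer above: it assumes the input is available as a plain character vector (called raw_lines here, a made-up name) and that the first row always has a group label.

```r
# Same five lines as in the question, as a plain character vector
# (raw_lines is a hypothetical name used only for this sketch)
raw_lines <- c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
)

# strip.white = TRUE trims the spaces around each tab-separated field
x <- read.table(text = raw_lines, sep = "\t", strip.white = TRUE,
                col.names = c("Group", "C1", "C2", "C3"),
                stringsAsFactors = FALSE)

# Rows that carry a real label (neither NA nor an empty string)
has_label <- !is.na(x$Group) & nzchar(x$Group)

# cumsum() counts the labels seen so far, so every unlabeled row
# picks up the index of the most recent label above it.
# Assumption: the first row is labeled, otherwise the index would be 0.
x$Group <- x$Group[has_label][cumsum(has_label)]

x
```

The cumsum() indexing trick avoids an explicit loop; zoo::na.locf() remains the more readable choice when the package is available.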

9 weeks ago
biofalconch ★ 1.1k

A good start would be looking into stringr's str_split. Not the complete output but with some effort you can achieve the final result :)

library(stringr)
# data is a one-column data frame here, so split its first column
dataSpl <- str_split(data[[1]], pattern = " \t ", simplify = TRUE)
print(dataSpl)
     [,1] [,2]   [,3]   [,4]
[1,] "A"  "2212" "2267" "2299"
[2,] ""   "2015" "2068" "2067"
[3,] "B"  "5210" "5289" "5293"
[4,] ""   "4583" "4604" "4205"
[5,] "C"  "4720" "4510" "4558"
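
One possible way to finish from a character matrix like the one above is sketched below. To keep the example runnable without extra packages it builds the matrix with base R's strsplit() instead of stringr's str_split(), which is a substitution made for this sketch only; the names raw_lines and out are likewise made up for illustration.

```r
# Same five lines as in the question (raw_lines is a hypothetical name)
raw_lines <- c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
)

# Base R stand-in for str_split(..., simplify = TRUE):
# split each line on " \t " and stack the pieces into a 5 x 4 matrix
dataSpl <- do.call(rbind, strsplit(raw_lines, " \t ", fixed = TRUE))

# Build the data frame, converting the three value columns to numeric
out <- data.frame(
  Group = dataSpl[, 1],
  C1 = as.numeric(dataSpl[, 2]),
  C2 = as.numeric(dataSpl[, 3]),
  C3 = as.numeric(dataSpl[, 4]),
  stringsAsFactors = FALSE
)

# Fill each empty group label with the label from the row above
# (assumes the first row is never empty)
for (i in seq_along(out$Group)[-1]) {
  if (out$Group[i] == "") out$Group[i] <- out$Group[i - 1]
}

out
```

The explicit loop is easy to follow for a handful of rows; for large tables a vectorised fill would likely be preferable.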