How to convert data in R?
7 months ago
star ▴ 350

I have input like the one below and would like to know how to convert it into the output format shown.

Input:

data <- c(
 "A \t 2212 \t 2267 \t 2299",
" \t 2015 \t 2068 \t 2067",
"B \t 5210 \t 5289 \t 5293",
" \t 4583 \t 4604 \t 4205",
"C \t 4720 \t 4510 \t 4558"
 )%>% data.frame()

Output:

Group   C1    C2    C3
A       2212  2267  2299
A       2015  2068  2067
B       5210  5289  5293
B       4583  4604  4205
C       4720  4510  4558
R offtopic • 1.3k views

This post does not fit the theme of this forum. It's a simple R question. Please add more biological context or the question will be deleted.

7 months ago
bk11 ★ 2.4k

I hope the OP will add biological context to this query. In the meantime, here is an answer:

data <- c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
)%>% data.frame()

lst <- as.data.frame(t(apply(data, 1, function(x) unname(unlist(strsplit(x, "\t"))))))
colnames(lst) <- c("Group", "C1", "C2", "C3")
lst$Group <- c("A", "A", "B", "B", "C")
lst
Group     C1     C2    C3
1     A  2212   2267   2299
2     A  2015   2068   2067
3     B  5210   5289   5293
4     B  4583   4604   4205
5     C  4720   4510   4558
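
A side note on the answer above: splitting on "\t" alone leaves the padding spaces in every field, and all columns stay character. The following is only a sketch of one way to tidy that up, assuming the same input as in the question; the intermediate name `fields` is made up for illustration, and the Group column is still filled by hand as in the answer.

```r
# Input from the question, built with a nested call so no pipe package is needed.
data <- data.frame(c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
))

# Split each row on tabs and trim the padding spaces around every field.
# `fields` is a hypothetical helper name, not part of the original answer.
fields <- t(apply(data, 1, function(x) trimws(strsplit(x, "\t")[[1]])))

lst <- as.data.frame(fields, stringsAsFactors = FALSE)
colnames(lst) <- c("Group", "C1", "C2", "C3")

# Convert the measurement columns from character to numeric.
value_cols <- c("C1", "C2", "C3")
lst[value_cols] <- lapply(lst[value_cols], as.numeric)

# Group labels are hard-coded here, exactly as in the answer above.
lst$Group <- c("A", "A", "B", "B", "C")
lst
```

With numeric columns, later steps such as sums or means per group work without further conversion.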

Great answer bk11.

star, you'll need the dplyr package to run this code. Just make the following the first line of code:

library(dplyr)

I don't see the need for the dplyr library in this code. c(), data.frame(), as.data.frame(), t(), apply(), unname(), unlist(), strsplit(), and colnames() are all base R functions.


You overlooked the %>% before the data.frame(). Technically though, magrittr would suffice. In the process of removing unnecessary stuff, changing c(...) %>% data.frame() to data.frame(c(...)) will eliminate the need for dplyr/magrittr.
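
For reference, the pipe-free construction could look like the sketch below. It uses only base R; the checks at the end are just there to show that the result is the same one-column data frame shape the pipe version would produce.

```r
# Nested call instead of c(...) %>% data.frame(): no dplyr or magrittr needed.
data <- data.frame(c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
))

# One column holding the five raw strings.
is.data.frame(data)
dim(data)
```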


Ah, thanks for pointing that out.

%>% is made available by magrittr. Are pipes in the dplyr namespace too? I wasn't aware of that.

Anyway, as of R version 4.1, users can pipe in base R as well, using the native pipe |>.
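
A small sketch of the native pipe applied to this thread's input, assuming R 4.1 or later (older versions will not parse |>). The names `rows`, `piped` and `nested` are made up for illustration.

```r
# Requires R >= 4.1 for the native pipe operator |>.
rows <- c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
)

# The native pipe passes the left-hand side as the first argument.
piped <- rows |> data.frame()

# The equivalent nested call, for comparison.
nested <- data.frame(rows)

identical(piped, nested)
```

No package needs to be attached for this version.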


While %>% is indeed exported by magrittr, dplyr is the most popular package to use it. dplyr imports %>% from magrittr and re-exports it; it does not define it again. Like I said, while the code technically only needs magrittr to work, in all probability, loading dplyr will open the road for further downstream analysis while loading just magrittr won't do that. In other words, it is sufficient to load magrittr but it is more efficient to load dplyr.

7 months ago
zx8754 11k

Read it as tab-delimited text, then fill the NAs:

# read as text connection
x <- read.table(textConnection(data[[1]]), sep = "\t", na.strings = " ", 
                col.names = c("Group", "C1", "C2", "C3"))

# fill NAs with the last known value
x$Group <- zoo::na.locf(x$Group)

x
#  Group   C1   C2   C3
# 1    A  2212 2267 2299
# 2    A  2015 2068 2067
# 3    B  5210 5289 5293
# 4    B  4583 4604 4205
# 5    C  4720 4510 4558
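
If installing zoo is not an option, the fill step can also be sketched in base R. The helper name `fill_forward` below is hypothetical, not a function from any package; it carries the last non-NA value forward and, in this sketch, leaves leading NAs as NA.

```r
# Hypothetical base R helper: carry the last non-NA value forward.
fill_forward <- function(v) {
  # Position of the most recent non-NA element at each index (0 if none yet).
  idx <- cummax(seq_along(v) * !is.na(v))
  # Positions before the first non-NA value have idx == 0; keep them missing.
  idx[idx == 0] <- NA
  v[idx]
}

group <- c("A", NA, "B", NA, "C")
fill_forward(group)
```

In the answer above this would be used as x$Group <- fill_forward(x$Group).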
7 months ago
biofalconch ★ 1.1k

A good start would be to look into stringr's str_split. This is not the complete output, but with some effort you can reach the final result :)

library(stringr)
# data is a one-column data frame here, so split its first column
dataSpl <- str_split(data[[1]], pattern = " \t ", simplify = TRUE)
print(dataSpl)
     [,1] [,2]   [,3]   [,4]
[1,] "A"  "2212" "2267" "2299"
[2,] ""   "2015" "2068" "2067"
[3,] "B"  "5210" "5289" "5293"
[4,] ""   "4583" "4604" "4205"
[5,] "C"  "4720" "4510" "4558"
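
One possible way to get from a split matrix like this to the requested output is sketched below. To keep it runnable without extra packages it uses base R's strsplit() in place of str_split(); the names `rows`, `parts`, `groups` and `result` are made up for illustration, and the blank group cells are filled with Reduce(), which is just one of several options.

```r
rows <- c(
  "A \t 2212 \t 2267 \t 2299",
  " \t 2015 \t 2068 \t 2067",
  "B \t 5210 \t 5289 \t 5293",
  " \t 4583 \t 4604 \t 4205",
  "C \t 4720 \t 4510 \t 4558"
)

# Base R stand-in for the str_split() call: a 5 x 4 character matrix.
parts <- do.call(rbind, strsplit(rows, " \t ", fixed = TRUE))

# Carry the last non-empty group label down into the blank cells.
groups <- Reduce(
  function(prev, cur) if (nzchar(cur)) cur else prev,
  parts[, 1],
  accumulate = TRUE
)

# Assemble the final table with numeric measurement columns.
result <- data.frame(
  Group = groups,
  C1 = as.numeric(parts[, 2]),
  C2 = as.numeric(parts[, 3]),
  C3 = as.numeric(parts[, 4])
)
result
```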