Question: Merge columns, dplyr
3 months ago, innyus20 wrote:


I am sure there is a pretty simple way to do this, but I cannot figure it out. My data is a CSV file with a header (patient ID; time to relapse in months; time to transformation in months), with one row per patient. It is a lymphoma cohort.

I want to create a swimmer plot where one can see relapse and/or transformation.

I need to transform data to this format, eg

ID      Response      Time (months)
1       Relapse       10
1       Relapse       17
1       Transf.       25
1       Censored      56

I only need the "start" time of relapse and/or transformation. I have seen the swimmer plot tutorial, so as long as I can get the data into this format I will be able to create the plot.

R • 143 views
modified 3 months ago • written 3 months ago by innyus20

It is not entirely clear to me what you want to do. If I understand correctly, what you have is the following:

           id      time_to_relapse      time_to_transformation
p1      001      xx                            yy
p2      002      xx1                          yy

Can you please post another, clearer example of what you want the data to look like after transforming?

written 3 months ago by manaswwm130

sorry, I was very unclear.

So, in my "raw" file I have columns: ID, State1 (relapse/transformation), Time1 (time in months from diagnosis to State1), State2 (relapse / transformation), Time2 (time from diagnosis in months to State 2), State3 (death/ censored), Time3 (time from diagnosis to censoring / death), Status (dead / censored at last follow-up), OS (time from diagnosis to last follow-up or death)

I want to merge the State columns together into one column for each patient, and do the same for Time.

The end result would be 3 columns: ID, State, Time.

I don't know how to post an example data frame, but e.g. Patient_ID1 would have State1 = Relapse, Time1 = 10, State2 = Transformation, Time2 = 20, State3 = Death, Time3 = 40, and so on for the other patients.

modified 3 months ago • written 3 months ago by innyus20
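
On posting a data frame example: one common way, sketched here under assumptions, is to build a tiny data frame and paste the output of `dput()` into the post. The column names follow the description above; the object name `example_data` and all values are made up for illustration.

```r
# Hypothetical example data: column names follow the description above,
# the values are invented for illustration only.
example_data <- data.frame(
  ID     = c("P1", "P2"),
  State1 = c("Relapse", "Transformation"),
  Time1  = c(10, 8),
  State2 = c("Transformation", NA),
  Time2  = c(20, NA),
  State3 = c("Death", "Censored"),
  Time3  = c(40, 56),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; pasting that output into a
# post lets others rebuild the data frame with a single copy-paste.
dput(example_data)

# For a large real data set, sharing only the first few rows is usually enough.
dput(head(example_data, 2))
```

Anyone reading the post can then assign the pasted `structure(...)` text to a variable and work with the same data.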

This is also not super informative, sorry. Roughly speaking: you have 3 columns each for state and time, you want to merge them into one column each, and then drop the individual columns?

I will make an attempt. Let's say your data is in a data frame called patient_data, and you want to merge the columns state_1, state_2 and state_3 into a column state, and time_1, time_2 and time_3 into a column time. Then you can do:

# Combine the three state columns into one string per patient;
# the value of `sep` is what separates the three states.
patient_data$state = paste(patient_data$state_1, patient_data$state_2, patient_data$state_3, sep = " ")

# Same for time
patient_data$time = paste(patient_data$time_1, patient_data$time_2, patient_data$time_3, sep = " ")

# patient_data now has 11 columns; to keep only 3 of them, subset by column name:
col_names = c("ID", "state", "time")
patient_data_spliced = patient_data[, col_names]

Sources: 1) for combining columns - 2) for splicing certain columns in dataframe -

written 3 months ago by manaswwm130
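
Note that `paste()` yields one combined string per patient, whereas the format requested at the top of the thread has one row per event. A minimal sketch of that reshaping with `tidyr::pivot_longer()` follows. It assumes the columns are named ID, State1/Time1, State2/Time2 and State3/Time3 as in the description above; the example values and the helper column name `event` are hypothetical, and this is not necessarily how the original poster solved it.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per patient (values invented for illustration).
patient_data <- data.frame(
  ID     = c("P1", "P2"),
  State1 = c("Relapse", "Transformation"),
  Time1  = c(10, 8),
  State2 = c("Transformation", NA),
  Time2  = c(20, NA),
  State3 = c("Death", "Censored"),
  Time3  = c(40, 56),
  stringsAsFactors = FALSE
)

patient_long <- patient_data %>%
  pivot_longer(
    cols = matches("^(State|Time)[0-9]+$"),
    # ".value" keeps the State/Time prefix as the output column name;
    # the trailing digit goes into a helper column called "event".
    names_to = c(".value", "event"),
    names_pattern = "^(State|Time)([0-9]+)$"
  ) %>%
  filter(!is.na(State)) %>%   # drop events that did not occur for a patient
  select(ID, State, Time) %>%
  arrange(ID, Time)

print(patient_long)
```

The result has one row per patient and event with the columns ID, State and Time, which is the long layout a swimmer plot typically expects.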

Thank you for this! I actually figured it out on my own and my swimmer plot looks fine now :) I just did not (and still do not) know how to post a question so that the data shows up like an R data frame.

written 3 months ago by innyus20

Instead of all those textual descriptions, please simply provide representative input and desired output examples.

written 3 months ago by ATpoint38k

I would if I only knew how...sorry, beginner.

written 3 months ago by innyus20