How to replace multiple string columns to binary values (0 and 1) in a dataframe?
2.7 years ago

Hello! I am trying to build a binary matrix, but first I need to convert multiple string columns to binary values (0 and 1). I tried to do this in R and Python, but my code didn't work. I was wondering if someone could help me.

I have a matrix of 29,584 rows x 982 columns, similar to the one in the attached image.

Each column whose name starts with X contains various string values, which begin with bb_, bpp_ or bp_. There are also missing data (blank cells). In every column that starts with X (that is, every column except G), I would like to replace all the string values with 1 and the missing data with 0.

I would be very grateful for your help.

I am attaching an image of the dataframe.

dataframe R • 2.2k views
2.7 years ago

Assuming you know how to use R and how to load an Excel sheet into R, use the following:

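library(dplyr)
# 1 where a cell holds a value, 0 where it is blank or NA; column G is left unchanged
df <- df %>%
  mutate(across(!starts_with("G"), ~ ifelse(is.na(.x) | .x == "", 0, 1)))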

Please post the errors if there are any. Please do not post images of the data; always post the data itself.
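
For anyone who wants to try this without the original spreadsheet, here is a minimal sketch on a made-up data frame. The column names G, X1 and X2 and all cell values are hypothetical stand-ins for the real data. It selects the columns with starts_with("X") rather than excluding G, and treats both empty strings and NA as missing, since blank Excel cells are often read in as NA.

```r
library(dplyr)

# Hypothetical toy data: G is the identifier column, X1 and X2 hold string
# values, with "" and NA standing in for the blank (missing) cells.
df <- data.frame(
  G  = c("g1", "g2", "g3"),
  X1 = c("bb_1", "", NA),
  X2 = c("", "bp_7", "bpp_3"),
  stringsAsFactors = FALSE
)

# For every column starting with X: 1 if the cell holds a value, 0 otherwise.
binary <- df %>%
  mutate(across(starts_with("X"), ~ as.integer(!is.na(.x) & .x != "")))

print(binary)
```

Using as.integer() on the logical test keeps the result as whole numbers, which is convenient if the columns are later turned into a matrix.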


Thank you for your help cpad0112. This code works great! Also, thank you for the suggestions about how to post.

biogamer.31

2.7 years ago
Sam ★ 4.7k

Assuming your data is called dat, in R, do the following:

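dat <- read.table(xxx)
name <- dat$G  # save column G so it can be restored afterwards
# 1 where a cell holds a value, 0 where it is blank or NA
dat[] <- lapply(dat, function(col) as.integer(!is.na(col) & col != ""))
dat$G <- name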
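
Since the end goal is a binary matrix, a base R variant that goes straight to an integer matrix might look like the sketch below. The toy data, the column names and the choice to use G as row names are assumptions made for illustration only, not part of the original data.

```r
# Hypothetical toy data standing in for the real table.
dat <- data.frame(
  G  = c("g1", "g2", "g3"),
  X1 = c("bb_1", "", NA),
  X2 = c("", "bp_7", "bpp_3"),
  stringsAsFactors = FALSE
)

# Names of the columns that start with X.
x_cols <- grep("^X", names(dat), value = TRUE)

# Logical matrix: TRUE where a cell holds a value, FALSE for "" or NA.
present <- !is.na(dat[x_cols]) & dat[x_cols] != ""

# Adding 0L turns TRUE/FALSE into an integer 1/0 matrix.
mat <- present + 0L
rownames(mat) <- dat$G

print(mat)
```

This leaves dat itself untouched and keeps the G values as row labels rather than as a column, so the matrix stays purely numeric.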

Thank you Sam. This code is an awesome help.
