Comparing 2 Columns at once
3.5 years ago
mail2steff ▴ 60

I am new to R programming. I have a data frame with 120 columns and 518 rows. I need to compare the columns in pairs (two at a time). If the two values in successive columns are the same, a 0 should be added to a new data frame; if they differ, a 1.

>data
V1 V2 V3 V4 V5 V6
A  A  C  C  G  G
A  G  T  T  C  G
G  C  T  A  A  C

The output should look like

>new_data_frame
V12 V34 V56
0   0   0
1   0   1
1   1   1

Can anyone help me with this? Thank you in advance

R seq • 760 views

You're skipping a couple of columns in your output example. Did you try any code in R? If so, show it along with any errors. If not, try something and come back with it.


I tried the combn function in R:

compare = t(combn(ncol(file8), 2, FUN = function(x) file8[, x[1]] == file8[, x[2]]))

But I got the following output:

V1  V2  V3  V4  V5  V6
1 1 1 1 1 1
0 0 0 0 0 0
0 0 0 0 0 0
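
One likely reason this attempt does not give the wanted layout: `combn(n, 2)` enumerates every unordered pair of column indices, not just the non-overlapping pairs (1,2), (3,4), and so on. A minimal sketch of the difference in Python, where `itertools.combinations` plays a similar role; the column names are the six from the example in the question, used here only for illustration:

```python
from itertools import combinations

columns = ["V1", "V2", "V3", "V4", "V5", "V6"]

# Every unordered pair of columns, analogous to what combn(6, 2) enumerates in R.
all_pairs = list(combinations(columns, 2))

# Only the non-overlapping successive pairs: (V1, V2), (V3, V4), (V5, V6).
wanted_pairs = list(zip(columns[0::2], columns[1::2]))

print(len(all_pairs))  # 15
print(wanted_pairs)    # [('V1', 'V2'), ('V3', 'V4'), ('V5', 'V6')]
```

With 120 columns the same gap is much larger: 7140 pairs in total versus the 60 pairs the question asks for.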

3.5 years ago
zx8754 10k

Taking advantage of recycling in R, we can do as below:

# data
df1 <- read.table(text = "V1 V2 V3 V4 V5 V6
A  A  C  C  G  G
A  G  T  T  C  G
G  C  T  A  A  C", header = TRUE, stringsAsFactors = FALSE)

# compare odd columns with even using recycling, then convert to number 0,1.
(!df1[, c(TRUE, FALSE)] == df1[, c(FALSE, TRUE)]) * 1
#      V1 V3 V5
# [1,]  0  0  0
# [2,]  1  0  1
# [3,]  1  1  1
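
To make the recycling step more concrete: the logical index `c(TRUE, FALSE)` is repeated across all columns and so keeps columns 1, 3, 5, ..., while `c(FALSE, TRUE)` keeps columns 2, 4, 6, .... The same pairing can be sketched in plain Python with strided slices. This is only an illustration of the idea, and the helper name `pair_mismatches` is made up for this example:

```python
# The three example rows from the question.
rows = [
    ["A", "A", "C", "C", "G", "G"],
    ["A", "G", "T", "T", "C", "G"],
    ["G", "C", "T", "A", "A", "C"],
]

def pair_mismatches(row):
    """Return 0 where a pair of neighbouring values matches, 1 where it differs."""
    # row[0::2] holds the 1st, 3rd, 5th values; row[1::2] holds the 2nd, 4th, 6th.
    return [int(a != b) for a, b in zip(row[0::2], row[1::2])]

result = [pair_mismatches(row) for row in rows]
for line in result:
    print(line)
# [0, 0, 0]
# [1, 0, 1]
# [1, 1, 1]
```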

Thank you so much. It worked perfectly.

3.5 years ago
shoujun.gu ▴ 370

Here is Python code; replace the file names in the first two lines with your real ones:

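input_file = 'your_input_file'
output_file = 'your_output_file'

import pandas as pd

# index_col=0 assumes the first column of the file holds row labels
df = pd.read_csv(input_file, index_col=0)
col = df.columns
col_t = col[:-1]

# name each comparison after its left column plus the position of the right one, e.g. V12, V23
new_col = [col_t[i] + str(i + 2) for i in range(len(col_t))]

# compare every column with its right-hand neighbour: 0 if equal, 1 if different
for i in range(len(col_t)):
    df[new_col[i]] = (df[col[i]] != df[col[i + 1]]).astype(int)

# keep only the comparison columns and write them to the chosen output file
df = df.loc[:, new_col]
df.to_csv(output_file)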
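
The loop above compares every column with its right-hand neighbour (V1 with V2, V2 with V3, and so on). A possible variant that compares only non-overlapping pairs and codes a mismatch as 1, as in the expected output in the question, might look like the sketch below. It assumes whitespace-separated input with no row-label column and an even number of columns; the `V12`-style naming is a guess at the intended scheme, not something the question specifies.

```python
import io

import pandas as pd

# Example data from the question, read from a string instead of a file.
text = """V1 V2 V3 V4 V5 V6
A  A  C  C  G  G
A  G  T  T  C  G
G  C  T  A  A  C"""
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

left = df.iloc[:, 0::2]   # columns 1, 3, 5, ...
right = df.iloc[:, 1::2]  # columns 2, 4, 6, ...

# Compare the underlying arrays so pandas does not try to align the
# differing column labels; True (mismatch) becomes 1, False becomes 0.
mismatch = (left.to_numpy() != right.to_numpy()).astype(int)

# Assumed naming: "V1" and "V2" give "V12", "V3" and "V4" give "V34".
names = [a + b.lstrip("V") for a, b in zip(left.columns, right.columns)]
result = pd.DataFrame(mismatch, columns=names, index=df.index)
print(result)
```

For a real file, the `io.StringIO(text)` argument would be replaced by the file path, and `result.to_csv(...)` would write the comparison table out.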

Thank you for the reply. I'll try this as well.
