Closed: Find all the combinations of mutations in a particular column and find their frequencies
4.9 years ago
xxxxxxxx ▴ 20

My file looks like this:

Pcol = patient
Mcol = mutations found in that patient



   Pcol       Mcol
    P1      M1,M2,M5,M6
    P2      M1,M2,M3,M5
    P3      M4,M5,M7,M6

I want to find all pairwise combinations of the Mcol elements and the frequency of each combination, i.e. the number of patients (rows of Pcol) in which that combination is present.

Expected output:

Mcol        freq
M1,M2        2
M1,M5        2
M1,M6        1
M2,M5        2
M2,M6        1
M5,M6        2
M1,M3        1
M2,M3        1
M4,M5        1
M4,M7        1
M4,M6        1
M7,M6        1

I have tried this:

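# Read the input; Mcol is expected to hold one comma-separated string per patient
x <- read.csv("file.csv", header = TRUE, stringsAsFactors = FALSE)

# For each patient, split Mcol on commas and build every pair of mutations
xx <- do.call(rbind.data.frame,
              lapply(x$Mcol, function(i) {
                n <- sort(unlist(strsplit(i, ",")))
                t(combn(n, 2))
              }))

# Paste each pair back together and count how often it occurs
data.frame(table(paste(xx[, 1], xx[, 2], sep = ",")))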

It doesn't give the expected output.
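
For reference, here is a minimal sketch of the pair counting in base R, run on the three example rows typed in directly rather than read from a file. It assumes each mutation should be counted at most once per patient and that stray whitespace around the names can be trimmed. `pair_freq()` is a hypothetical helper name, not a function from any package. Each patient's mutations are sorted before pairing, so the pair written as M7,M6 above is reported as M6,M7. The pairs M3,M5 and M5,M7 also show up, since they occur in P2 and P3.

```r
# Example rows copied from the question (assumed to be read in as one string per patient)
x <- data.frame(
  Pcol = c("P1", "P2", "P3"),
  Mcol = c("M1,M2,M5,M6", "M1,M2,M3,M5", "M4,M5,M7,M6"),
  stringsAsFactors = FALSE
)

# pair_freq() is a hypothetical helper: it counts, for every pair of
# mutations, the number of patients in which both are present.
pair_freq <- function(mcol, sep = ",") {
  pairs <- unlist(lapply(mcol, function(s) {
    # Split one patient's string, trim whitespace, drop duplicates, sort
    m <- sort(unique(trimws(strsplit(s, sep, fixed = TRUE)[[1]])))
    if (length(m) < 2) return(character(0))  # fewer than two mutations: no pair
    # All 2-element combinations, each pasted back into "A,B"
    combn(m, 2, FUN = paste, collapse = sep)
  }))
  tab <- table(pairs)
  data.frame(Mcol = names(tab), freq = as.integer(tab), stringsAsFactors = FALSE)
}

res <- pair_freq(x$Mcol)
# Most frequent pairs first, ties broken by name
res[order(-res$freq, res$Mcol), ]
```

Because the names are sorted as strings, a name such as M10 would sort before M2. That does not affect the counts, only the order in which the two names of a pair are written.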

r mutation • 147 views