HELP: Excel Index & Match in R - How to do it?
1
0
Entering edit mode
10 months ago
Pepitostar • 0

Hi Biostar users! Im a bit new to R and I want someone to help me with this problem. Im trying to solve it using loop function in R, unfortunately, I cant do it. Please help, I'm solving this almost a day. In end I want to have an output file that is similar to the given example below. Thanks in advance.

df1 <- data.frame(stringsAsFactors = FALSE,
id = c('13B1', '13B2', '13B3', '13B4', '13B5', '13B6', '13B7', '13B8', '13B9', '13B10'),
A1 = c('ABB',  'CD',     'NA',   'CD',   'CD',   'CD',   'ABB',    'CD',     'CD',   'CD'),
A2 = c('ASD',  'ASD',  'ASD',  'ASD',  'ASD',  'ASD',  'ADD',  'ASD',  'ASD',  'ASD'),
A3 = c('WWW',  'WWW',  'NA',     'WWW',    'NA',     'WWW',    'NA',     'NA',   'XX',   'NA'),
A4 = c('QRS',  'QRS',  'QRS',  'QRS',  'QRS',  'QRS',  'QRS',  'QRS',  'QRS',  'QRS'),
A5 = c('HMM',  'HMM',  'HMM',  'HMM',  'HMM',  'HMM',  'HMM',  'HMM',  'HMM',  'ABB')
)

df2 <- data.frame(stringsAsFactors = FALSE,
Remark = c('Good',   'Bad'),
A1 = c('ABB','CD'),
A4 = c('QRS','FST'),
A3 = c('WWW','XX'),
A5 = c('HMM','ABB'),
)

output <- data.frame(stringsAsFactors = FALSE,
id = c('13B1',   '13B2', '13B3', '13B4', '13B5', '13B6', '13B7', '13B8', '13B9', '13B10'),
A3 = c('Good',   'Good', 'NA',   'Good', 'NA',   'Good', 'NA',   'NA',   'Bad',  'NA'),
A4 = c('Good',   'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Good'),
A5 = c('Good',   'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Bad')
)

R snp • 272 views
3
Entering edit mode
10 months ago

Here's a tidyverse solution. The trick is to go from wide to long data, join the data.frames, and then go back to wide data.

library("dplyr")
library("tidyr")

df1 <- pivot_longer(df1, starts_with("A"), names_to="col", values_to="sample")
df2 <- pivot_longer(df2, starts_with("A"), names_to="col", values_to="sample")

output <- df1 %>%
left_join(df2, by=c("col", "sample")) %>%
select(-sample) %>%
pivot_wider(names_from="col", values_from="Remark")

>output
# A tibble: 10 x 6
id    A1    A2    A3    A4    A5
<chr> <chr> <chr> <chr> <chr> <chr>
1 13B1  Good  Bad   Good  Good  Good
2 13B2  Bad   Bad   Good  Good  Good
3 13B3  NA    Bad   NA    Good  Good
4 13B4  Bad   Bad   Good  Good  Good
5 13B5  Bad   Bad   NA    Good  Good
6 13B6  Bad   Bad   Good  Good  Good
7 13B7  Good  Good  NA    Good  Good
8 13B8  Bad   Bad   NA    Good  Good

0
Entering edit mode

OMG! I tried and it works. You are heaven-sent. Thanks a lot man for this clean and beautiful codes.

Note to the viewers: I have a difficulty of making this work due to the "pivot_longer" or tidyr function. You have to install the latest version of R to make this work. Mine was 3.5.3 :)

0
Entering edit mode

Glad it worked! pivot_longer and pivot_wider are from tidyr version 1.1.0 released this May 2020. These functions are replacements for gather and spread respectively.