Assigning an name based on matching three possible cells/columns
1
0
Entering edit mode
18 months ago

I have a data frame that I have created that looks like the image attached (small section for simplicity).

enter image description here

I am trying to find a way to match the two letters (in yellow) to the F1 (blue), B6 (pink), or NOD (green) and create a new column with the name of the assigned match. This data frame is quite large 100 matches to make and 1000+ SNPs to determine.

Any suggestions? I am stumped...

R Dataframe • 805 views
ADD COMMENT
0
Entering edit mode

Try using the dplyr functions mutate() and case_when() to create a new column with your matches. You'll probably want to change your column names first, since R doesn't always behave as expected when column names are numbers. Do you have to account for cases where there are zero matches / more than one match?

ADD REPLY
0
Entering edit mode

A reply of N/A is sufficient for zero matches (and there will be no more than one match).

ADD REPLY
0
Entering edit mode

Thank you!

dataset %>% select(SNP name, F1_mouse, B6_mouse, NOD_mouse, M547) %>% mutate( Genotype_M547 = case_when( M547 == F1_mouse ~ "HET", M547 == B6_mouse ~ "B6", M547 == NOD_mouse ~ "NOD", TRUE ~ "other" ) )

Just got to work out how to loop it now :)

ADD REPLY
0
Entering edit mode

Just for others, this did the trick;

aSNP <- aSNP %>% mutate_at(vars(one_of("M547", "M548", "M549", "M550", "M551", "M552", "M553", "M554", "M555", "M556", "M557", "M558", "M559", "M561", "M562", "M563", "M564", "M565", "M566", "M567", "M568", "M223", "M224", "M227", "M228", "M231", "M232", "M233", "M234", "M235", "M746", "M747", "M748", "M749", "M750", "M751", "M752", "M753", "M754", "M755", "M770", "M771", "M772", "M773", "M774", "M775", "M776", "M777", "M778", "M779", "M830", "M831", "M832", "M833", "M834", "M837", "M838", "M839", "M840")), funs(case_when(. == F1_mouse ~ "HET", . == B6_mouse ~ "B6",. == NOD_mouse ~ "NOD")))

ADD REPLY
0
Entering edit mode
18 months ago
Mensur Dlakic ★ 27k

What exactly did you try that stumped you?

This is fairly trivial to do, both in Excel and by using some programming. For the Excel solution, searching for multiple if conditions in excel should get you on track.

ADD COMMENT
0
Entering edit mode

I am wanting to know how I would approach this using R/dataframe. In excel I use the =IF('Combined data'!Z2='Combined data'!D2,"HET",IF('Combined data'!Z2='Combined data'!F2,"B6",IF('Combined data'!Z2='Combined data'!H2,"NOD"))) but would have to write it out 100+ times (Which I have done). Just wanted to know if someone could point me to the correct way to go about this using some programming.

ADD REPLY
0
Entering edit mode

but would have to write it out 100+ times

You don't have to do it 100+ times. Once is enough, and copy+paste after that. Cell numbers should be incremented automatically.

Using multiple IF-THEN statements is also how it would be done in any programming language after reading the dataframe in.

ADD REPLY

Login before adding your answer.

Traffic: 1444 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6