recoding genotype in R

6.7 years ago

Famf ▴ 30

I am trying to recode the following genotype data frame from biallelic calls to an "A"/"B" code, where "A" corresponds to P1 and "B" to P2, the two parents. Please consider the following example.

SNP P1  P2  in1 in2 in3 in4
M01 CC  GG  CC  GG  CC  GG
M02 TT  CC  TT  TT  CC  TT
M03 AA  GG  AA  GG  GG  GG
M04 CC  GG  CC  GG  CC  GG
M05 GG  AA  AA  GG  AA  AA
M06 CC  GG  CC  GG  CC  CC

All individuals that have the same genotype as P1 should become "A", and those that carry the P2 genotype should become "B", as in the following table:

SNP P1  P2  in1 in2 in3 in4
M01 CC  GG  A   B   A   B
M02 TT  CC  A   A   B   A
M03 AA  GG  A   B   B   B
M04 CC  GG  A   B   A   B
M05 GG  AA  B   A   B   B
M06 CC  GG  A   B   A   A

I have tried the following code in R, which fails with the error shown after it:

df[is.na(df[4:7]) == as.character(df$P1)] = "A"
df[is.na(df[4:7]) == as.character(df$P2)] = "B"

Error in `[<-.data.frame`(`*tmp*`, is.na(df[4:7]) == as.character(df$P1),  : 
  unsupported matrix index in replacement

Any idea how to solve it?
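
One possible approach, offered as a minimal sketch rather than a definitive answer: the attempt above probably fails because `is.na(df[4:7])` yields a 6 x 4 logical matrix, and a logical matrix index into a data frame is, as far as I can tell, only accepted when it has the same dimensions as the whole data frame (6 x 7 here). Comparing the genotype columns directly against `P1` and `P2` avoids this. The sketch below rebuilds the example data by hand; the names `ind_cols`, `geno`, and `recoded` are made up for illustration, and it assumes the individuals sit in columns 4 to 7 and that genotype strings are written identically in parent and offspring columns.

```r
# Rebuild the example data from the question (column names as posted).
df <- data.frame(
  SNP = c("M01", "M02", "M03", "M04", "M05", "M06"),
  P1  = c("CC", "TT", "AA", "CC", "GG", "CC"),
  P2  = c("GG", "CC", "GG", "GG", "AA", "GG"),
  in1 = c("CC", "TT", "AA", "CC", "AA", "CC"),
  in2 = c("GG", "TT", "GG", "GG", "GG", "GG"),
  in3 = c("CC", "CC", "GG", "CC", "AA", "CC"),
  in4 = c("GG", "TT", "GG", "GG", "AA", "CC"),
  stringsAsFactors = FALSE
)

ind_cols <- 4:7                    # assumed position of in1..in4
geno <- as.matrix(df[ind_cols])    # character matrix, one row per SNP

# Start from NA so a genotype matching neither parent stays missing
# instead of silently keeping its original call.
recoded <- matrix(NA_character_, nrow(geno), ncol(geno),
                  dimnames = dimnames(geno))

# A vector with one value per row is recycled down each column, so
# every cell is compared with the parental genotype of its own SNP.
recoded[geno == as.character(df$P1)] <- "A"
recoded[geno == as.character(df$P2)] <- "B"

# Write the recoded columns back into the data frame.
df[ind_cols] <- as.data.frame(recoded, stringsAsFactors = FALSE)
print(df)
```

Under this rule, in2 at M05 (GG, the same as P1) comes out as "A". Heterozygous or missing calls would be left as `NA`, which may or may not be what you want.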

recode genotype R • 1.5k views