I am trying to recode the following genotype data frame from biallelic codes to an "A"/"B" code, where "A" corresponds to P1 and "B" to P2, the two parents. Please consider the following example.
SNP  P1  P2  in1  in2  in3  in4
M01  CC  GG  CC   GG   CC   GG
M02  TT  CC  TT   TT   CC   TT
M03  AA  GG  AA   GG   GG   GG
M04  CC  GG  CC   GG   CC   GG
M05  GG  AA  AA   GG   AA   AA
M06  CC  GG  CC   GG   CC   CC
All individuals that have the same genotype as P1 should become "A", and those that carry the genotype of P2 should become "B", as in the following table:
SNP  P1  P2  in1  in2  in3  in4
M01  CC  GG  A    B    A    B
M02  TT  CC  A    A    B    A
M03  AA  GG  A    B    B    B
M04  CC  GG  A    B    A    B
M05  GG  AA  B    B    B    B
M06  CC  GG  A    B    A    A
I have tried the following code in R, which fails with the error shown below it:
df[is.na(df[4:7]) == as.character(df$P1)] = "A"
df[is.na(df[4:7]) == as.character(df$P2)] = "B"

Error in `[<-.data.frame`(`*tmp*`, is.na(df[4:7]) == as.character(df$P1), :
  unsupported matrix index in replacement
Any idea how to solve it?
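One possible approach, as a minimal sketch: the attempt above compares the result of is.na() (a 6 x 4 logical matrix) with the parental genotypes, and then uses that 6 x 4 matrix to index the full 6 x 7 data frame, which is most likely what triggers the "unsupported matrix index" error. Comparing each individual column directly with P1 and P2 avoids the matrix index altogether. The sketch assumes the individuals sit in columns 4 to 7 and that the genotypes can be treated as character strings; the name ind_cols is only illustrative, and genotypes matching neither parent are set to NA here, which is an assumption rather than something stated in the question.

```r
# Rebuild the example data from the question (genotypes kept as character).
df <- data.frame(
  SNP = c("M01", "M02", "M03", "M04", "M05", "M06"),
  P1  = c("CC", "TT", "AA", "CC", "GG", "CC"),
  P2  = c("GG", "CC", "GG", "GG", "AA", "GG"),
  in1 = c("CC", "TT", "AA", "CC", "AA", "CC"),
  in2 = c("GG", "TT", "GG", "GG", "GG", "GG"),
  in3 = c("CC", "CC", "GG", "CC", "AA", "CC"),
  in4 = c("GG", "TT", "GG", "GG", "AA", "CC"),
  stringsAsFactors = FALSE
)

# Assumption: the individuals are in columns 4 to 7 (illustrative name).
ind_cols <- 4:7

# Compare every individual column with the parental columns, row by row.
# Genotypes that match neither parent become NA (an assumption).
df[ind_cols] <- lapply(df[ind_cols], function(g) {
  g <- as.character(g)
  ifelse(g == df$P1, "A", ifelse(g == df$P2, "B", NA_character_))
})

print(df)
```

Applied strictly, this rule gives B A B B for M05, because in2 is GG there and so matches P1; the desired table above shows B for that cell, so it is worth checking which of the two is intended.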