Question: Counting the frequency of genotypes per row based on the calls of the first column in a data frame in R
Famf (United States) wrote, 8 months ago:

I have a genotype data frame in R similar to this:

ID  P1  P2  in1 in2 in3 in4
M01 CC  GG  CC  GG  CC  GG
M02 TT  CC  TT  TT  CC  TT
M03 AA  GG  AA  GG  GG  GG
M04 CC  GG  CC  GG  CC  GG
M05 GG  AA  AA  GG  AA  AA
M06 CC  GG  CC  GG  CC  CC

I want to add a column that gives, for each row, how often the genotype in column P1 occurs in that row, counting only the columns from in1 onward. The result should look like the table below:

ID  P1  P2  in1 in2 in3 in4 frqP1
M01 CC  GG  CC  GG  CC  GG  2
M02 TT  CC  TT  TT  CC  TT  3
M03 AA  GG  AA  GG  GG  GG  1
M04 CC  GG  CC  GG  CC  GG  2
M05 GG  AA  AA  GG  AA  AA  1
M06 CC  GG  CC  GG  CC  CC  3

I tried the following code, but it doesn't work:

df$frqP1 <- rowSums(df[-1] == df$P1)

Any idea?


"it doesn't work"

Does it throw an error (if so, please add the error or warning message), or does it give the wrong output?

Comment written 8 months ago by zx8754
Answer: ATpoint (Germany) wrote, 8 months ago:

df$frqP1 <- rowSums(df[-c(1:3)] == as.character(df$P1))

You were almost right. Convert the query (df$P1) from a factor to character, and make sure the comparison only covers the in-columns, i.e. drop columns 1 to 3 (ID, P1 and P2).
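
For reference, here is a minimal sketch that rebuilds the example data from the question and applies the same comparison. It assumes the genotype columns were read in as factors (hence stringsAsFactors = TRUE, which was the default before R 4.0.0). Selecting the in-columns by name with grep() is only an illustrative alternative to dropping columns 1 to 3 by position, and in_cols is a hypothetical variable name.

```r
# Rebuild the example data from the question. stringsAsFactors = TRUE is an
# assumption that mimics the pre-R-4.0 default, so P1 and the in-columns
# are factors here.
df <- data.frame(
  ID  = c("M01", "M02", "M03", "M04", "M05", "M06"),
  P1  = c("CC", "TT", "AA", "CC", "GG", "CC"),
  P2  = c("GG", "CC", "GG", "GG", "AA", "GG"),
  in1 = c("CC", "TT", "AA", "CC", "AA", "CC"),
  in2 = c("GG", "TT", "GG", "GG", "GG", "GG"),
  in3 = c("CC", "CC", "GG", "CC", "AA", "CC"),
  in4 = c("GG", "TT", "GG", "GG", "AA", "CC"),
  stringsAsFactors = TRUE
)

# Pick the in-columns by name instead of by position (illustrative choice).
in_cols <- grep("^in", names(df), value = TRUE)

# Compare every in-column with P1 row by row, then count the TRUEs per row.
df$frqP1 <- rowSums(df[in_cols] == as.character(df$P1))

# frqP1 is now 2 3 1 2 1 3, matching the table in the question.
print(df)
```

Selecting by name keeps the count correct if further non-sample columns are added before in1, whereas df[-c(1:3)] relies on the column order staying fixed.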


Indeed, that works! But I noticed that it returns NA in the frqP1 column for rows that have at least one missing value (NA). Is there any way to avoid that?

Comment written 8 months ago by Famf

Pass na.rm = TRUE to rowSums() to ignore NAs. Read the manual page (?rowSums).

Comment written 8 months ago by zx8754
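
A small sketch of how the na.rm suggestion behaves, using a two-row data frame in which the missing call in row 2 is hypothetical data added for illustration. As above, factor columns are an assumption, and matches is a hypothetical variable name.

```r
# Two-row example; the NA in in2 of row 2 is made-up data for illustration.
df <- data.frame(
  ID  = c("M01", "M02"),
  P1  = c("CC", "TT"),
  P2  = c("GG", "CC"),
  in1 = c("CC", "TT"),
  in2 = c("GG", NA),
  in3 = c("CC", "CC"),
  in4 = c("GG", "TT"),
  stringsAsFactors = TRUE
)

# Logical matrix of per-cell matches; comparing NA with anything yields NA.
matches <- df[-c(1:3)] == as.character(df$P1)

# Without na.rm, any NA in a row makes that row's sum NA.
rowSums(matches)

# With na.rm = TRUE, NA comparisons are dropped before summing.
df$frqP1 <- rowSums(matches, na.rm = TRUE)
print(df)
```

Note that with na.rm = TRUE a missing call is simply not counted, so in frqP1 it cannot be told apart from a mismatch.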