Question: Complement genotype
BlackHole0 wrote:

I am a beginner in R, so I apologize for such a simple question.

I have two columns:

``````
V1 V2
T  1
C  0
A  1
``````

If column V2 is 1, I want to replace the nucleotide with its complement; if it is 0, the nucleotide is left as is. I wrote a for loop (my real data has many rows), but after running it I get:

``````
V1 V2 V3
T  1 NA
C  0 NA
A  1 T
``````

And I want to get:

``````
V1 V2 V3
T  1 T
C  0 C
A  1 T
``````

Where is the mistake?

Is it possible to do this without the for loop, for example using apply?

R • 855 views
modified 2.8 years ago by Biostar ♦♦ 20 • written 4.0 years ago by BlackHole0

Not really a bioinformatics question, might get closed. Which command did you try?

I tried to use:

``````
for(i in nrow(Tri1_a)){

  if(Tri1_a$V2[i] == 1){
    if(Tri1_a$V1[i] == "T")
      Tri1_a$V3[i] = "A"
    if(Tri1_a$V1[i] == "A")
      Tri1_a$V3[i] = "T"
    if(Tri1_a$V1[i] == "G")
      Tri1_a$V3[i] = "C"
    if(Tri1_a$V1[i] == "C")
      Tri1_a$V3[i] = "G"
  }
  else{
    Tri1_a$V3[i] = Tri1_a$V1[i]
  }

  i = i + 1
}
``````

There probably is a better way... Why the `i = i + 1`? Could you edit your first post to contain the output you would like to obtain?
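One likely cause of the `NA` rows: `for (i in nrow(Tri1_a))` iterates over a single value, the row count itself, so only the last row is ever processed. Looping over `seq_len(nrow(Tri1_a))` visits every row, and the `i = i + 1` is then unnecessary because `for` advances `i` on its own. A minimal sketch of a corrected loop, assuming the three-row example from the question and that V1 is stored as character; the `switch()` call is a compact stand-in for the chain of `if` statements:

```r
# Example data from the question (V1 kept as character, not factor).
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  stringsAsFactors = FALSE
)
Tri1_a$V3 <- NA_character_

# seq_len(nrow(...)) yields 1, 2, ..., nrow, so every row is visited.
for (i in seq_len(nrow(Tri1_a))) {
  if (Tri1_a$V2[i] == 1) {
    # Complement the base when V2 is 1.
    Tri1_a$V3[i] <- switch(Tri1_a$V1[i], T = "A", A = "T", G = "C", C = "G")
  } else {
    # Otherwise keep the base unchanged.
    Tri1_a$V3[i] <- Tri1_a$V1[i]
  }
}
Tri1_a$V3
```

With this rule the three example rows give `A`, `C`, `T` in V3.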

ddiez1.9k wrote:

Unless I am misunderstanding something, I think what you want is:

``````
# Sample dataset (note the stringsAsFactors = FALSE).
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  V3 = c(NA, NA, "T"),
  stringsAsFactors = FALSE
)
Tri1_a
#>   V1 V2   V3
#> 1  T  1 <NA>
#> 2  C  0 <NA>
#> 3  A  1    T

# Replace NAs in V3 with the values from V1.
sel.na <- is.na(Tri1_a$V3)
Tri1_a$V3[sel.na] <- Tri1_a$V1[sel.na]
Tri1_a
#>   V1 V2 V3
#> 1  T  1  T
#> 2  C  0  C
#> 3  A  1  T
``````
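If the aim is instead the rule stated in the question (complement the base when V2 is 1, keep it otherwise), the loop can probably be replaced by a vectorized lookup with no `for` or `apply` at all. A minimal sketch, assuming only the bases A, C, G and T occur; `comp` is a name introduced here for illustration:

```r
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Named vector mapping each base to its complement.
comp <- c(A = "T", T = "A", G = "C", C = "G")

# Complement where V2 is 1, otherwise keep V1 unchanged.
Tri1_a$V3 <- ifelse(Tri1_a$V2 == 1, unname(comp[Tri1_a$V1]), Tri1_a$V1)
Tri1_a$V3
```

Indexing `comp` by the character column looks up all rows at once, and `ifelse()` then picks the complemented or original base per row.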

Make sure your sequence data are not factors (you can probably pass `stringsAsFactors = FALSE` to the reading function, such as `read.table`), or you might get a warning like:

``````
Warning message:
In `[<-.factor`(`*tmp*`, sel.na, value = c(1L, NA, 1L)) :
  invalid factor level, NA generated
``````

And the result will not be as expected.
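If the columns have already been read in as factors, converting them with `as.character()` before assigning should avoid this warning. Since R 4.0.0 `stringsAsFactors` defaults to `FALSE`, so this mainly matters for older versions or for columns explicitly created as factors. A minimal sketch with a made-up data frame:

```r
# Hypothetical data in which V1 and V3 are stored as factors.
Tri1_a <- data.frame(
  V1 = factor(c("T", "C", "A")),
  V2 = c(1, 0, 1),
  V3 = factor(c(NA, NA, "T"))
)

# Convert the factor columns to plain character vectors.
Tri1_a$V1 <- as.character(Tri1_a$V1)
Tri1_a$V3 <- as.character(Tri1_a$V3)

# The NA replacement now operates on character vectors, not factor levels.
sel.na <- is.na(Tri1_a$V3)
Tri1_a$V3[sel.na] <- Tri1_a$V1[sel.na]
Tri1_a$V3
```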