Question: Complement genotype
BlackHole wrote:

I am a beginner in R, so I apologize for such a simple question.

I have two columns:

V1 V2
 T  1
 C  0
 A  1

If V2 is 1, I want to replace the nucleotide in V1 with its complement; if V2 is 0, it should be left as is. I wrote a for loop (my data has many rows), but after running it I get:

V1 V2 V3
 T  1 NA
 C  0 NA
 A  1 T

And I want to get:

 V1 V2 V3
 T  1 T
 C  0 C
 A  1 T

Where is the mistake?

Is it possible to do this without the for loop, for example using apply?

written 2.6 years ago by BlackHole

Not really a bioinformatics question, might get closed. Which command did you try?

written 2.6 years ago by WouterDeCoster

I tried to use:

  for(i in nrow(Tri1_a)){

     if(Tri1_a$V2[i] == 1){
      if(Tri1_a$V1[i] == "T")
        Tri1_a$V3[i] = "A"
      if(Tri1_a$V1[i] == "A")
        Tri1_a$V3[i] = "T"
      if(Tri1_a$V1[i] == "G")
        Tri1_a$V3[i] = "C"
      if(Tri1_a$V1[i] == "C")
        Tri1_a$V3[i] = "G"
    }
    else{
      Tri1_a$V3[i] = Tri1_a$V1[i]
    }

    i = i + 1
  }
written 2.6 years ago by BlackHole

There probably is a better way... Why the i = i + 1? Could you edit your first post to contain the output you would like to obtain?

written 2.6 years ago by WouterDeCoster
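
One likely cause of the NA values, consistent with only the last row being filled in: `for(i in nrow(Tri1_a))` iterates over a single value (the row count), so only the last row is ever visited. Iterating over `seq_len(nrow(Tri1_a))` visits every row, and the `i = i + 1` is not needed because `for` advances `i` itself. The sketch below is not the poster's exact code: it rebuilds the three-row sample and swaps the chain of `if` statements for a lookup vector named `complement`, which is an illustrative name.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps V1 as character.
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Lookup vector (illustrative name): names are bases, values are complements.
complement <- c(A = "T", T = "A", G = "C", C = "G")

Tri1_a$V3 <- NA_character_
for (i in seq_len(nrow(Tri1_a))) {  # visits rows 1, 2, ..., nrow(Tri1_a)
  if (Tri1_a$V2[i] == 1) {
    # V2 is 1: store the complementary base
    Tri1_a$V3[i] <- complement[[Tri1_a$V1[i]]]
  } else {
    # V2 is 0: copy the base unchanged
    Tri1_a$V3[i] <- Tri1_a$V1[i]
  }
}
Tri1_a
```

Under the rule as stated in the question (complement when V2 is 1), the first row becomes A rather than T.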
ddiez wrote:

Unless I am misunderstanding something, I think what you want is:

# sample dataset (note the stringsAsFactors = FALSE).
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  V3 = c(NA, NA, "T"),
  stringsAsFactors = FALSE
)
Tri1_a
  V1 V2   V3
1  T  1 <NA>
2  C  0 <NA>
3  A  1    T

# replace NAs in V3 by the values at V1.
sel.na <- is.na(Tri1_a$V3)
Tri1_a$V3[sel.na] <- Tri1_a$V1[sel.na]
Tri1_a
  V1 V2 V3
1  T  1  T
2  C  0  C
3  A  1  T

Make sure your sequence data are not factors (you can probably pass stringsAsFactors = FALSE to the reading function, such as read.table), or you might get a warning like:

Warning message:
In `[<-.factor`(`*tmp*`, sel.na, value = c(1L, NA, 1L)) :
  invalid factor level, NA generated

And the result will not be as expected.

written 2.6 years ago by ddiez
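
On the question of avoiding the for loop: a vectorised version is possible in base R without apply. The sketch below is one option rather than the method from the answer above. It assumes V1 holds single upper-case bases (A, C, G, T) and uses `chartr` to complement every base at once, then `ifelse` to keep the complemented value only where V2 is 1. The `as.character` call guards against V1 having been read in as a factor.

```r
# Sample data from the question.
Tri1_a <- data.frame(
  V1 = c("T", "C", "A"),
  V2 = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Make sure V1 is character, not factor, before string operations.
Tri1_a$V1 <- as.character(Tri1_a$V1)

# chartr maps A->T, C->G, G->C, T->A for the whole column at once;
# ifelse picks the complement where V2 == 1 and the original base otherwise.
Tri1_a$V3 <- ifelse(
  Tri1_a$V2 == 1,
  chartr("ACGT", "TGCA", Tri1_a$V1),
  Tri1_a$V1
)
Tri1_a
```

Both `chartr` and `ifelse` work on whole vectors, so this scales to many rows without an explicit loop.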