Question: Lost in "if else" statement R
Lila M (460), UK, wrote 23 months ago (1 vote):

Hi everybody! I'm trying to write an R loop to process one of my data frames, in which I've stored ChIP-seq data. My data frame looks like this:

chr1    700245  714068  -   13824   uc001abo.3
chr1    934342  935552  -   1211    uc001aci.2
chr1    1189292 1203372 -   14081   uc001adm.4
chr1    1189292 1209234 -   19943   uc001ado.3
chr1    1243994 1247057 +   3064    uc001aed.3

I want to modify the 2nd and 3rd columns (adding and subtracting some values). To do that, I've written the following code:

import_file= read.delim("file", sep="\t", header = F)
file =as.data.frame.matrix(import_file)
#length( file$V6)

for (i in length file$V6)){

if (any( file$V4 == '-')) {
file$V2 =  file$V3-150
file$V3 =  file$V3 +50
 } 
else {   
#if file$V4 == '+' do this
file$V2 =  file$V2-50
file$V3 =  file$V2 +150
   } 
}

The first "if" branch is processed fine, but instead of running the "else" branch for the other condition, the loop repeats the "if" branch. What I want to do is:

 if (any( file$V4 == '-')) {
    file$V2 =  file$V3-150
    file$V3 =  file$V3 +50
#works

and

 if file$V4 == '+' do this
    file$V2 =  file$V2-50
    file$V3 =  file$V2 +150
#doesn't work

Any help?

Thanks a lot!


Hi guys, your approaches perform the calculation on the modified data (after adding or subtracting) rather than on the original data frame. Using the original values is exactly what I want :)

Written 23 months ago by Lila M (460)

Please use ADD COMMENT/ADD REPLY (or add this to the original question) to keep threads logically organized.

Written 23 months ago by genomax (62k)
Devon Ryan (88k), Freiburg, Germany, wrote 23 months ago (3 votes):

You're not subsetting file inside your for loop, so if(any(file$V4 == '-')) will always be true. Get rid of the any() and only change the ith entry.

Having said that:

idx = which(file$V4 == '-')
file$V2[idx] =  file$V3[idx] - 150
file$V3[idx] =  file$V3[idx] +50
idx = which(file$V4 != '-')
file$V2[idx] =  file$V2[idx] - 50
file$V3[idx] =  file$V2[idx] +150

Alternatively, convert this to a GRanges object and use flank(), which is strand aware.
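
For reference, here is a minimal sketch of the same vectorized idea that computes every new coordinate from the original values, which is what the asker says she wants in her follow-up comment. The names `regions`, `start0`, `end0` and `minus` are illustrative and not from the thread, and rows that are not '-' are assumed to be '+'.

```r
# Example rows copied from the question; `regions` is an illustrative name.
regions <- data.frame(
  V1 = "chr1",
  V2 = c(700245, 934342, 1189292, 1189292, 1243994),
  V3 = c(714068, 935552, 1203372, 1209234, 1247057),
  V4 = c("-", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)

# Keep untouched copies so later assignments never see updated values.
start0 <- regions$V2
end0   <- regions$V3
minus  <- regions$V4 == "-"   # logical mask, one entry per row

# Minus strand: window around the original end coordinate.
regions$V2[minus] <- end0[minus] - 150
regions$V3[minus] <- end0[minus] + 50

# Remaining rows (assumed to be "+"): window around the original start.
regions$V2[!minus] <- start0[!minus] - 50
regions$V3[!minus] <- start0[!minus] + 150

print(regions)
```

Because the right-hand sides only read `start0` and `end0`, the order of the four assignments no longer matters.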


You beat me by one second :D

Written 23 months ago by Carlo Yague (4.4k)

I typed less, that's why :)

Written 23 months ago by Devon Ryan (88k)
Carlo Yague (4.4k), Belgium, wrote 23 months ago (2 votes):

I think you should index with [i] inside your if/else; otherwise I don't understand the purpose of the loop.

import_file= read.delim("file", sep="\t", header = F)
file =as.data.frame.matrix(import_file)
#length( file$V6)

for (i in c(1:length(file$V6))){
  if (( file$V4 [i]== '-')) {
    file$V2[i] =  file$V3[i]-150
    file$V3[i] =  file$V3[i] +50
   } 
  else {   
  #if file$V4[i] == '+' do this
    file$V2[i] =  file$V2[i]-50
    file$V3[i] =  file$V2[i] +150
   } 
}
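
A hedged variant of the loop, assuming the goal is to derive both new coordinates from the row's original values (as the asker requests elsewhere in the thread). The names `regions`, `start0` and `end0` are illustrative; `seq_len(nrow(...))` is used in place of `1:length(...)` so that an empty data frame produces zero iterations.

```r
# Two example rows from the question, one per strand.
regions <- data.frame(
  V2 = c(700245, 1243994),
  V3 = c(714068, 1247057),
  V4 = c("-", "+"),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(regions))) {
  # Read the original coordinates of row i before changing anything.
  start0 <- regions$V2[i]
  end0   <- regions$V3[i]

  if (regions$V4[i] == "-") {
    regions$V2[i] <- end0 - 150
    regions$V3[i] <- end0 + 50
  } else {
    # Assumed to be the "+" strand.
    regions$V2[i] <- start0 - 50
    regions$V3[i] <- start0 + 150
  }
}

print(regions)
```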

PS : indentation is your friend.

PPS : This code could be much more efficient without a loop:

file$V2[which(file$V4=='-')]=file$V3[which(file$V4=='-')]-150
file$V3[which(file$V4=='-')]=file$V3[which(file$V4=='-')]+50
...

Thank you for the advice! The first code you proposed doesn't work at all. The second one is fine for "-", but when I do the same for "+":

unique_intersect$V2[which(unique_intersect$V4=='+')]=unique_intersect$V2[which(unique_intersect$V4=='+')]-50
unique_intersect$V3[which(unique_intersect$V4=='+')]=unique_intersect$V2[which(unique_intersect$V4=='+')]+150

the result for unique_intersect$V3 is calculated from the already-modified unique_intersect$V2 (that is, after the -50 was applied), so the statement doesn't do exactly what I want. Anyway, I'll keep trying!

Written 23 months ago by Lila M (460)

Oh I see, there is a syntax error (that I copy-pasted from your code). for (i in length file$V6)){ should be for (i in c(1:length(file$V6))){. It is now fixed.

Once corrected, the code outputs the following. Is this what you want?

    V1      V2      V3 V4    V5         V6
1 chr1  713918  714118  - 13824 uc001abo.3
2 chr1  935402  935602  -  1211 uc001aci.2
3 chr1 1203222 1203422  - 14081 uc001adm.4
4 chr1 1209084 1209284  - 19943 uc001ado.3
5 chr1 1243944 1244094  +  3064 uc001aed.3
Written 23 months ago (modified 23 months ago) by Carlo Yague (4.4k)
zjhzwang (180) wrote 23 months ago (1 vote):

I think you can do it another way:

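library(dplyr)
# as_tibble() replaces the deprecated tbl_df()
data <- as_tibble(read.table("file_path", header = FALSE, stringsAsFactors = FALSE))
# "+" strand rows (note: V3 is computed from the already-updated V2)
data_1 <- filter(data, V4 == "+")
data_1$V2 <- data_1$V2 - 50
data_1$V3 <- data_1$V2 + 150
# "-" strand rows
data_2 <- filter(data, V4 == "-")
data_2$V2 <- data_2$V3 - 150
data_2$V3 <- data_2$V3 + 50
# Recombine; rows end up grouped by strand, not in the original order
result <- rbind(data_1, data_2)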
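
As a possible alternative that keeps the rows in their original order, the sketch below uses base R `transform()` with `ifelse()` instead of dplyr. It relies on `transform()` evaluating all of its expressions against the unmodified data frame, so `V2` and `V3` on the right-hand side are the original values. The name `regions` is illustrative, and any strand other than "-" is assumed to be "+".

```r
# Example rows copied from the question; `regions` is an illustrative name.
regions <- data.frame(
  V1 = "chr1",
  V2 = c(700245, 934342, 1189292, 1189292, 1243994),
  V3 = c(714068, 935552, 1203372, 1209234, 1247057),
  V4 = c("-", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)

# Both expressions are evaluated on the original columns of `regions`,
# so the new V3 does not depend on the new V2.
result <- transform(
  regions,
  V2 = ifelse(V4 == "-", V3 - 150, V2 - 50),
  V3 = ifelse(V4 == "-", V3 + 50,  V2 + 150)
)

print(result)
```

Note that dplyr's `mutate()` behaves differently: later expressions there do see columns changed earlier in the same call.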
LLTommy (1.2k) wrote 23 months ago (0 votes):

I haven't done any R for a while, but are you sure any() is doing what you expect it to do? I just googled it, and the documentation for any and all says: 'Check whether any or all of the elements of a vector are TRUE.' This suggests to me that if one element of the vector (a column, in your case) is true, the expression returns TRUE. So that behaviour would make perfect sense, because you always have a '-' in the column that you posted. I suggest that you investigate this; I think you have to change your condition a little bit.
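
A small illustration of that behaviour, using a made-up `strand` vector (not data from the thread):

```r
strand <- c("-", "-", "+")   # hypothetical strand column

elementwise <- strand == "-"      # one result per element: TRUE TRUE FALSE
any_minus   <- any(elementwise)   # single TRUE: at least one element is "-"
all_minus   <- all(elementwise)   # single FALSE: not every element is "-"

# Inside a loop, test only the current element instead of the whole column.
i <- 3
third_is_minus <- strand[i] == "-"   # FALSE for the "+" row

print(elementwise)
print(c(any = any_minus, all = all_minus, third = third_is_minus))
```

So `if (any(file$V4 == '-'))` takes the same branch on every iteration as long as the column contains at least one '-'.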


Ok, while I typed this some other people spotted the same thing. Problem solved I'd say.

Written 23 months ago by LLTommy (1.2k)