Question

The i variable in the for loop changes

0

23 months ago

3095916029 • 0

I set 'n' in advance, and then I use a for loop to extract the desired rows from a data frame

n<-as.numeric(all_apa1$distal)

But as we go through the loop, it's as if i becomes 1, 2, 3

data1 <- data.frame()
for (i in n) {
  data <- m6a_seq[which(as.numeric(m6a_seq$start) - i < 200 & as.numeric(m6a_seq$start) - i > -200), ]
  data$position <- as.character(i)
  data1 <- rbind(data1, data)
}

Can someone please tell me how to deal with this?

R • 1.1k views

ADD COMMENT • link updated 23 months ago by Michael 54k • written 23 months ago by 3095916029 • 0

score 0 · Answer 1 · 2022-05-30

0

23 months ago

Michael 54k

I don't think you have given us enough information to figure out whether anything is wrong at all, let alone what. The only thing I can say is that R is rock-solid in guaranteeing that a variable will not change without an assignment to it. So, if i becomes 1, 2, 3, those values were in the original vector n.

> for (i in c(10,1,42)) print(i)
[1] 10
[1] 1
[1] 42

The most likely reason for an unexpected result is that the original data was in fact a factor:

> n = as.numeric(as.factor(c(10,1,42)))
> n
[1] 2 1 3
> str(n)
 num [1:3] 2 1 3

That might seem surprising, but it is consistent, documented behavior: as.numeric() on a factor returns the internal integer level codes, not the labels.

A quick fix is to convert to character first:

> n = as.numeric(as.character(as.factor(c(10,1,42))))
> n
[1] 10  1 42

Or better, define proper column classes while importing (colClasses), or set stringsAsFactors = FALSE.
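To make the check and the fix concrete, here is a minimal sketch. The toy table and the inline text standing in for a file are assumptions for illustration; only the column name distal comes from the question. Note that since R 4.0.0, read.table() and data.frame() default to stringsAsFactors = FALSE, so this mainly bites with older R versions or data that was explicitly converted to a factor.

```r
# Hypothetical toy table: 'distal' ended up as a factor by mistake.
all_apa1 <- data.frame(distal = factor(c("10", "1", "42")))

# Diagnose: inspect the class before looping over the values.
print(class(all_apa1$distal))        # a factor here signals the problem
print(as.numeric(all_apa1$distal))   # level codes, not the positions

# Fix after the fact: go through the labels, then convert to numeric.
n <- as.numeric(as.character(all_apa1$distal))
print(n)

# Fix at import time: request explicit column classes.
# The 'text' argument stands in for a real file path (assumption).
apa <- read.table(text = "chr distal\nchr1 10\nchr1 1\nchr2 42",
                  header = TRUE,
                  stringsAsFactors = FALSE,
                  colClasses = c("character", "numeric"))
str(apa)
```

Running class() or str() on the column right after import is usually the quickest way to confirm whether this is what happened.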

Anyway, you seem to do some sort of interval overlap within a window of +/- 200bp around a genomic position. I highly recommend shifting your analysis to GenomicRanges to do this efficiently and correctly.
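As a rough sketch of what that could look like (not a drop-in replacement for your script): the toy tables below are made up, and the chr column in all_apa1 is an assumption, since the question only shows distal. Each m6A record is represented by its start coordinate, as in the posted loop, and each APA site is expanded to a window of 199 bp on either side, which matches the loop's strict -200 < start - i < 200 condition for integer coordinates.

```r
library(GenomicRanges)  # Bioconductor package; also attaches IRanges

# Toy stand-ins for the real tables (values are invented).
all_apa1 <- data.frame(chr = c("chr1", "chr1", "chr2"),
                       distal = c(1000, 5000, 1000))
m6a_seq <- data.frame(chr = c("chr1", "chr1", "chr2", "chr2"),
                      start = c(900, 5300, 1100, 9000),
                      end = c(950, 5350, 1150, 9050),
                      value = c(1.5, 2.0, 0.7, 3.1))

# One window per APA site: 199 bp on either side of the position.
apa_win <- GRanges(seqnames = all_apa1$chr,
                   ranges = IRanges(start = all_apa1$distal - 199,
                                    end = all_apa1$distal + 199))

# One single-base range per m6A record, placed at its start coordinate.
m6a_pos <- GRanges(seqnames = m6a_seq$chr,
                   ranges = IRanges(start = m6a_seq$start, width = 1))

# All (m6A record, APA window) pairs that overlap on the same chromosome.
hits <- findOverlaps(m6a_pos, apa_win)

# Assemble the result without any explicit loop.
data1 <- m6a_seq[queryHits(hits), ]
data1$position <- all_apa1$distal[subjectHits(hits)]
data1$offset <- data1$start - data1$position  # signed distance to the APA site
print(data1)
```

Unlike the loop, which compares start coordinates only, this matches ranges per chromosome, so a site on chr2 cannot pick up a peak on chr1. The offset column is what you would then pass to something like density() or hist() for the distribution plot.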

ADD COMMENT • link 23 months ago by Michael 54k

0

Thank you for your reply. I hope to make a density map of the m6A distribution within 200 bp upstream and downstream of APA sites. 'n' contains the locations of the APA sites. m6a_seq is a BED file obtained from MeRIP-seq, containing chr, start, end, and value. I'm going to find every row in m6a_seq that lies within 200 bp of an APA site and do the following

ADD REPLY • link 23 months ago by 3095916029 • 0

0

OK, please see my updated answer. I highly recommend doing this with the GenomicRanges package and its interval overlap methods if you want to perform the analysis in R. Also, when importing data, make sure numeric columns are imported as such and not converted to factors, which can sometimes happen.

ADD REPLY • link 23 months ago by Michael 54k

0

But when I do this,

for ( i in n ) {print(i)}

I still get the numeric values in 'n', not anything else.

ADD REPLY • link 23 months ago by 3095916029 • 0

0

But that's what you wanted, isn't it?

ADD REPLY • link 23 months ago by Michael 54k

0

Yes, that is what I need. But when I run the loop, it still uses 1, 2, 3, not these values.

ADD REPLY • link 23 months ago by 3095916029 • 0

0

I don't see what could cause this because I have seen only fragments of your code. Anyway, simply scrap it and use GenomicRanges instead, then you need no loops.

ADD REPLY • link 23 months ago by Michael 54k

0

Hahaha, I think I need to try it. Thank you again. I am still a novice in bioinformatics and have a lot to learn. This is my first time asking a question here, so thank you for giving me a good experience.

ADD REPLY • link 23 months ago by 3095916029 • 0

1

You are welcome. Try to stay on the well-trodden path (such as using existing packages) if you can, at least in the beginning. I think even seasoned programmers and bioinformaticians can and will make terrible mistakes when trying to code things from scratch. So, the best code is the code I didn't have to write at all.

ADD REPLY • link 23 months ago by Michael 54k