Hi everyone,
I have two unequal data sets.
A data set
V1     V2          V3
6   42721754    42721769
6   42721757    42721772
6   42721760    42721775
6   42721763    42721778
6   42721766    42721781
6   42721769    42721784
6   42721772    42721787
6   42721775    42721790
B data set
V2              AF
42721757    0.003067485
42721760    0.006134969
42721763    0.006134969
42721766    0.003067485
42721769    0.006134969
42721772    0.006134969
42721775    0.003067485
42721778    0.006134969
42721781    0.003067485
42721784    0.003067485
42721787    0.009202454
42721790    0.009202454
I want to check if  the value of V2 in B data set  falls between the values of V2 and V3 of A data set. And when its true, I want value of AF in B data set to be added as a new column in A dataset. 
There are two important points.
1. B data set is 80 rows while A data set is 6000 rows.
2. value of V2 in B may be repeated,  which shouldn't get thrown out in final output.
an urgent help with be appreciated.
Thanks,
If you can convert "B data set" as bed file, you could use bedtools intersect.
Thanks Rashedul, however bedtools might not work because of redundancy.
Thanks Kevin,
That was quick. I tried but it didn't work for me, that's weird. I have exact same data sets and i copied your code. It didn't reproduce the product you have shown. Can you please take another look? Your all assumptions are correct except one thing I would like to clarify that I also want to use 'less than or equal to' and 'more than or equal to' options here. So, will
>=and<=work?Thanks a bunch.
Hey, which system are you using? - Mac? Are you sure that your files are tab-delimited
I am using RedHat Enterprise Linux 7 and yes the files are tab -delimited. Will it work on mac? Thanks for the help again. Really appreciate and looking forward to solve this problem.
Hey Kevin, I am trying to do it in R as well but I think there is a common problem which is causing this not to work with real dataset. My apologies. With example data it works on Mac but its not working with real dataset. I realized that the real dataset has many regions in
A.txtwhich doesn't overlap with theB.txt. Please note that in real dataset,A.txthas about 10,000 rows whileB.txthas about 80 rows. Can we introduce NA or zero for the entries which are present inA.txtbut absent inB.txt? Thanks for your help.You just accepted our answers? - is the issue resolved?
I accepted answer for the example data given. However, I am still struggling with the real data.
Feel free to ask a new question with example data that is more representative to your real data, and provide expected output.
This is called "merge on overlap", see this related StackOverflow post: