Hi, I'm using Bedtools for R (bedr) on Ubuntu and I am trying to merge genomic regions that are within 100 bp of one another. For example, I have a data frame (df) with 3274 observations of 3 variables, and the first rows look like this:
[image: df]
When I merge using:
> df1 <- bedr.merge.region(df.sorted, distance = 100, number = TRUE, check.zero.based = TRUE, check.chr = TRUE, check.valid = TRUE, check.sort = TRUE)
I get the data frame (df1):
[image: df1]
My goal is to merge all coordinates that are within 100 bp (distance = 100). From data frame df, I expected it to merge the first 2 rows together and then the last 2 rows together, since there are fewer than 100 bp between the start of row 1 (in df) and the end of row 2 (in df). Instead it merged all 4 rows (as shown in df1), which gives a distance of 156 bp (1215038 - 1214882 = 156).
Can anyone help me understand why the parameter "distance = 100" does not restrict merging to regions within 100 bp, and instead merges regions spanning 156 bp? The goal is to design probes for the wet lab to capture regions of interest, but our probes are limited to 100 bp, so I want to see how many 100 bp probes I would need to build to capture all regions and what their coordinates would be.
Thank you,
Joana
To clarify, my expected output from the code is (df2):
[image: df2]
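To make the expected grouping concrete, here is a minimal base-R sketch of the two behaviours as I understand them. The coordinates are made up for illustration (they are not my real data), and merge_by_gap and merge_by_span are hypothetical helpers, not bedr functions. I am assuming that a distance-based merge compares the gap between neighbouring regions and chains them, which may not be exactly what bedr does internally. Both helpers assume the input is already sorted.

```r
# Toy coordinates, made up for illustration (NOT my real data).
toy <- data.frame(
  chr   = "chr1",
  start = c(1000, 1060, 1120, 1180),
  end   = c(1040, 1100, 1160, 1220),
  stringsAsFactors = FALSE
)

# Hypothetical helper: chain a region onto the previous merged region
# whenever the gap between them is <= distance (my assumption of how a
# distance-based merge behaves).
merge_by_gap <- function(df, distance) {
  out <- df[1, ]
  for (i in seq_len(nrow(df))[-1]) {
    last <- nrow(out)
    same_chr <- df$chr[i] == out$chr[last]
    if (same_chr && df$start[i] - out$end[last] <= distance) {
      out$end[last] <- max(out$end[last], df$end[i])
    } else {
      out <- rbind(out, df[i, ])
    }
  }
  rownames(out) <- NULL
  out
}

# Hypothetical helper: add a region to the current group only while the
# whole group still spans <= max_span bp (what I need for 100 bp probes).
merge_by_span <- function(df, max_span) {
  out <- df[1, ]
  for (i in seq_len(nrow(df))[-1]) {
    last <- nrow(out)
    same_chr <- df$chr[i] == out$chr[last]
    new_end <- max(out$end[last], df$end[i])
    if (same_chr && new_end - out$start[last] <= max_span) {
      out$end[last] <- new_end
    } else {
      out <- rbind(out, df[i, ])
    }
  }
  rownames(out) <- NULL
  out
}

by_gap  <- merge_by_gap(toy, 100)   # one region spanning 220 bp
by_span <- merge_by_span(toy, 100)  # two regions of 100 bp each
print(by_gap)
print(by_span)
```

With these toy rows, the gap-based version chains all 4 rows into a single region wider than 100 bp, while the span-capped version gives the "first 2 together, last 2 together" grouping I am after.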