Question

merge regions into one bigger region

8 months ago

Chironex ▴ 50

Hi, I have a list of regions and I would like to merge adjacent, similar regions into one larger region that includes all of them:

print(selected_table, n = 30)
# A tibble: 271,961 × 4
   chr     start     end type 
   <chr>   <dbl>   <dbl> <chr>
 1 chr1  3119617 3119911 dELS 
 2 chr1  3119914 3120119 dELS 
 3 chr1  3120346 3120662 dELS 
 4 chr1  3445885 3446198 pELS 
 5 chr1  3451695 3451984 dELS 
 6 chr1  3671056 3671275 dELS 
 7 chr1  3671632 3671982 dELS 
 8 chr1  4426742 4427077 dELS 
 9 chr1  4427313 4427595 dELS 
10 chr1  4492814 4493110 dELS 
11 chr1  4493488 4493815 dELS 
12 chr1  4496403 4496645 dELS 
13 chr1  4497406 4497654 dELS 
14 chr1  4544065 4544410 dELS 
15 chr1  4571285 4571565 dELS 
16 chr1  4572346 4572692 dELS 
17 chr1  4613338 4613673 dELS 
18 chr1  4617524 4617855 dELS 
19 chr1  4622424 4622768 dELS 
20 chr1  4654054 4654327 dELS 
21 chr1  4655294 4655592 pELS 
22 chr1  4671326 4671550 dELS 
23 chr1  4671654 4671956 dELS 
24 chr1  4706158 4706508 dELS 
25 chr1  4784168 4784518 dELS 
26 chr1  4785232 4785421 dELS

What do you suggest? Should I use GenomicRanges::reduce()? I am not sure what min.gapwidth to set: some regions are adjacent, but the distance between them is larger for some pairs than for others. Any suggestions?

genomicranges • 621 views

score 0 · Answer 1 · 2024-02-24

8 months ago

zau saa ▴ 150

I would plot the distribution of the distance between adjacent intervals for further analysis.

Hi, could you explain how to do that?

ADD REPLY • link 8 months ago by Chironex ▴ 50

Plot the distribution of the distances and look for a suitable threshold that separates short gaps from long ones, then merge the regions so that most of the short gaps are removed.
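
A minimal base R sketch of that idea, using the first seven rows of the table from the question as toy data. The names `regions`, `max_gap`, and `merged` are made up for illustration, the threshold of 1000 bp is an arbitrary placeholder to be replaced after looking at the histogram, and the gap is taken as `start - previous end`, which assumes BED-like coordinates. Regions are only merged when they are on the same chromosome and have the same `type`, which is one possible reading of "adjacent, similar".

```r
# Toy subset of the table from the question (first 7 rows).
regions <- data.frame(
  chr   = "chr1",
  start = c(3119617, 3119914, 3120346, 3445885, 3451695, 3671056, 3671632),
  end   = c(3119911, 3120119, 3120662, 3446198, 3451984, 3671275, 3671982),
  type  = c("dELS", "dELS", "dELS", "pELS", "dELS", "dELS", "dELS"),
  stringsAsFactors = FALSE
)

# Sort so that neighbours in the table are neighbours on the genome.
regions <- regions[order(regions$chr, regions$start), ]
n <- nrow(regions)

# Gap to the previous region; NA for the first row and when the chromosome changes.
same_chr <- c(FALSE, regions$chr[-1] == regions$chr[-n])
regions$gap <- c(NA, regions$start[-1] - regions$end[-n])
regions$gap[!same_chr] <- NA

# Gaps typically span several orders of magnitude, so plot them on a log scale.
# hist() drops NA and other non-finite values.
hist(log10(regions$gap), breaks = 50,
     main = "Gaps between adjacent regions", xlab = "log10(gap in bp)")

# Hypothetical threshold: choose it from the histogram, e.g. at the valley
# between a "short gap" peak and a "long gap" peak if there is one.
max_gap <- 1000

# A new merged block starts at the first row of a chromosome, when the type
# changes, or when the gap exceeds the threshold.
same_type <- c(FALSE, regions$type[-1] == regions$type[-n])
new_block <- is.na(regions$gap) | !same_type | regions$gap > max_gap
block <- cumsum(new_block)

# Collapse each block into one region spanning all of its members.
first <- !duplicated(block)
merged <- data.frame(
  chr      = regions$chr[first],
  start    = regions$start[first],
  end      = as.numeric(tapply(regions$end, block, max)),
  type     = regions$type[first],
  n_merged = as.integer(table(block))
)
print(merged)
```

If you would rather stay within GenomicRanges, the chosen threshold plays the role of `min.gapwidth` in `reduce()`, up to an off-by-one difference depending on the coordinate convention. Note that `reduce()` on its own does not look at the `type` column, so you would have to split the ranges by type first if that matters to you.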