I am trying to merge the intervals in a BED file containing all repeat-masked positions of a genome, but the merge gives me unexpected results.
For example, I have this sorted, unmerged BED file:
Nitab4.5_0000001 1 383
Nitab4.5_0000001 384 384
Nitab4.5_0000001 385 385
Nitab4.5_0000001 386 387
Nitab4.5_0000001 388 388
Nitab4.5_0000001 389 389
Nitab4.5_0000001 390 390
Nitab4.5_0000001 391 395
Nitab4.5_0000001 396 402
Nitab4.5_0000001 403 404
The merged BED file is:
Nitab4.5_0000001 1 395
Nitab4.5_0000001 396 402
Nitab4.5_0000001 403 404
I don't understand why this isn't merged into a single feature. Furthermore, the merged output contains 0-based start coordinates, which do not appear in the sorted BED file.
For example, the sorted BED file:
Nitab4.5_0000003 1 1
Nitab4.5_0000003 2 2
Nitab4.5_0000003 3 4
Nitab4.5_0000003 5 9
Nitab4.5_0000003 10 11
Nitab4.5_0000003 12 16
Nitab4.5_0000003 17 24
Nitab4.5_0000003 25 28
Nitab4.5_0000003 29 73
Nitab4.5_0000003 74 90
......
The merged BED file:
Nitab4.5_0000003 0 4
Nitab4.5_0000003 5 9
Nitab4.5_0000003 10 11
Nitab4.5_0000003 12 16
Nitab4.5_0000003 17 24
Nitab4.5_0000003 25 28
Nitab4.5_0000003 29 73
Nitab4.5_0000003 74 90
Nitab4.5_0000003 91 213
Nitab4.5_0000003 214 221
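To make the expected result for the first example (Nitab4.5_0000001) concrete, here is a minimal Python sketch of the merge I have in mind. It assumes the coordinates are 1-based and closed, so that an interval ending at 383 and one starting at 384 count as directly adjacent and should be joined. The function name merge_adjacent is made up for illustration and is not part of any tool.

```python
def merge_adjacent(intervals):
    """Merge sorted (chrom, start, end) intervals that overlap or touch.

    Assumption: coordinates are 1-based and closed, so (1, 383) and
    (384, 384) are directly adjacent and get joined.
    """
    merged = []
    for chrom, start, end in intervals:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            # Overlapping or adjacent: extend the previous interval.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # Gap or new sequence: start a new interval.
            merged.append((chrom, start, end))
    return merged


# The first sorted, unmerged example from above.
rows = [
    ("Nitab4.5_0000001", 1, 383),
    ("Nitab4.5_0000001", 384, 384),
    ("Nitab4.5_0000001", 385, 385),
    ("Nitab4.5_0000001", 386, 387),
    ("Nitab4.5_0000001", 388, 388),
    ("Nitab4.5_0000001", 389, 389),
    ("Nitab4.5_0000001", 390, 390),
    ("Nitab4.5_0000001", 391, 395),
    ("Nitab4.5_0000001", 396, 402),
    ("Nitab4.5_0000001", 403, 404),
]

print(merge_adjacent(rows))
```

Under these assumptions the ten input lines collapse into the single feature Nitab4.5_0000001 1 404, which is what I expected the merged BED file to contain.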
Could anyone help me with this?