Split bed file per sequence length
9.0 years ago
cg2827 • 0

I need to split my BED file into several files that each cover the same total sequence length. My input is not a whole chromosome but a list of annotations with variable lengths and gaps between them. BedTools windowMaker only splits a fragment into windows of the requested size when the fragment is larger than the window, so it does not do what I want here.

For instance, suppose I have as an input the following:

chr1    0    90
chr1    149    200
chr1    249    300
chr1    310    510

And I want BED files that each contain 100 bp, such as:

File 1:

chr1    0    90
chr1    149    159

File 2:

chr1    159    200
chr1    249    300
chr1    310    318

And so on...

Or something like:

chr1    0    90    block1
chr1    149    159    block1
chr1    159    200    block2
chr1    249    300    block2
chr1    310    318    block2

Bedtools outputs this instead:

chr1    0    90
chr1    149    200
chr1    249    300
chr1    310    410
chr1    410    510

Is there a way to define the output based on sequence length instead of windows?

bed • 3.8k views

Sorry, but I don't understand what you want to achieve. In file 1 the coordinates run from 0 to 159 and the lengths are 90 and 10. In file 2 the coordinates run from 159 to 318 and the lengths are 41, 51 and 8. So where is this "100bp"?


90+10=100

41+51+8=100

What I want is for every file to contain exactly 100 bp in total, however many lines that takes.

9.0 years ago

You could write a script in awk or Perl to do this pretty easily. Just read the start and stop positions, track how many bases are "left" in a current window/"block" to print out, and loop through all your elements until there are none left.
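The bookkeeping described above can be sketched in a few lines. The sketch is in Python rather than awk or Perl, and the function name `split_into_blocks` and the `blockN` labels are made up for illustration. It assumes the input is already sorted and non-overlapping, lets a block continue across a chromosome boundary, and leaves the last block short if the total length is not a multiple of the block size.

```python
from typing import Iterable, Iterator, Tuple

Interval = Tuple[str, int, int]


def split_into_blocks(
    intervals: Iterable[Interval], block_size: int = 100
) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (chrom, start, end, label) so that each label covers block_size bases."""
    block = 1
    remaining = block_size  # bases still needed to fill the current block
    for chrom, start, end in intervals:
        while start < end:
            # Take as much of this interval as the current block can still hold.
            take = min(remaining, end - start)
            yield chrom, start, start + take, f"block{block}"
            start += take
            remaining -= take
            if remaining == 0:
                # Block is full: move on to the next one.
                block += 1
                remaining = block_size


# The intervals from the question.
example = [
    ("chr1", 0, 90),
    ("chr1", 149, 200),
    ("chr1", 249, 300),
    ("chr1", 310, 510),
]

for row in split_into_blocks(example, 100):
    print(*row, sep="\t")
```

To get one file per block instead of a label column, open a new output file whenever the label changes.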


Thanks Alex. I thought about using bedtools to make 1 bp windows and then splitting the output according to the number of sites I want per file (100 lines, i.e. 100 bp, in the example), but this might be cumbersome. Do you envision something more straightforward?
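For comparison, the 1 bp idea can be sketched without bedtools. The helper names `per_base` and `chunks_of_bases` are hypothetical, and the work grows with the number of bases rather than the number of intervals, which is why it gets cumbersome on large inputs. It also merges consecutive positions back together, so input intervals that touch each other would come out as one.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Interval = Tuple[str, int, int]


def per_base(intervals: Iterable[Interval]) -> Iterator[Tuple[str, int]]:
    """Expand every interval into single (chrom, position) records."""
    for chrom, start, end in intervals:
        for pos in range(start, end):
            yield chrom, pos


def chunks_of_bases(intervals: Iterable[Interval], n: int = 100) -> Iterator[List[Interval]]:
    """Group the per-base stream into chunks of n bases and merge them back into intervals."""
    bases = per_base(intervals)
    while True:
        chunk = list(islice(bases, n))
        if not chunk:
            return
        merged: List[List] = []
        for chrom, pos in chunk:
            if merged and merged[-1][0] == chrom and merged[-1][2] == pos:
                merged[-1][2] = pos + 1  # extend the current run of adjacent bases
            else:
                merged.append([chrom, pos, pos + 1])  # start a new run
        yield [(c, s, e) for c, s, e in merged]


example = [("chr1", 0, 90), ("chr1", 149, 200), ("chr1", 249, 300), ("chr1", 310, 510)]
for number, chunk in enumerate(chunks_of_bases(example, 100), start=1):
    for chrom, start, end in chunk:
        print(chrom, start, end, f"block{number}", sep="\t")
```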
