Question: Expanding Genomic Coordinates Using Bedtools
gtasource20 wrote:

I have the following bed file

chr1    18551   18579   
chr1    18559   18583   
chr1    18966   18991   
chr1    18966   18991   
chr1    18966   18993

I want to expand the coordinates to generate a new file that lists every single coordinate individually. For example:

chr1 18551 18552
chr1 18552 18553
...
chr1 18578 18579

I want to do this for every interval in the BED file and write all of the output to a single file. Any help? I've played around with bedtools slop, but it wasn't doing exactly what I needed.

Alex Reynolds28k wrote:

You could --merge and --chop intervals from one to N sorted BED files into single-base intervals with BEDOPS bedops and a Unix pipe:

$ bedops --merge A.bed ... N.bed | bedops --chop 1 - > answer.bed

The merge step merges overlapping intervals before chopping. This removes duplicates.

If you instead want duplicate single-base intervals where there are overlaps, just replace the merge operation with a union operation:

$ bedops --everything A.bed ... N.bed | bedops --chop 1 - > answer.bed
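
To make the difference between the two outcomes concrete (duplicates removed versus duplicates kept), here is a minimal Python sketch run on the five intervals from the question. It does not call bedops and is not meant to mirror its implementation or output order; the helper names merge and chop are made up for illustration.

```python
def chop(intervals):
    """Yield one single-base interval per position of each (chrom, start, end)."""
    for chrom, start, end in intervals:
        for pos in range(start, end):
            yield (chrom, pos, pos + 1)


def merge(intervals):
    """Merge overlapping or touching intervals; input need not be sorted."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


# The five intervals from the question (0-based, half-open BED coordinates).
sample = [
    ("chr1", 18551, 18579),
    ("chr1", 18559, 18583),
    ("chr1", 18966, 18991),
    ("chr1", 18966, 18991),
    ("chr1", 18966, 18993),
]

with_duplicates = list(chop(sorted(sample)))      # every interval expanded as-is
without_duplicates = list(chop(merge(sample)))    # merged first, then expanded

print(len(with_duplicates), len(without_duplicates))  # prints: 129 59
```

On this input the merged intervals are chr1:18551-18583 and chr1:18966-18993, so the de-duplicated output has 32 + 27 = 59 lines, against 129 lines when every input interval is expanded separately.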
Aaronquinlan11k wrote:
bedtools makewindows -b in.bed -w 1
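
A quick way to sanity-check the output of an expansion like this, assuming each input interval is expanded independently (overlaps are not merged first): the number of output lines should equal the sum of end minus start over the input, and every output line should span exactly one base. The sketch below is only an illustration; the function names are hypothetical and it works on lists of text lines rather than on files.

```python
def expected_line_count(bed_lines):
    """Sum of (end - start) over all intervals; blank or short lines are skipped."""
    total = 0
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        total += int(fields[2]) - int(fields[1])
    return total


def all_single_base(bed_lines):
    """True if every interval in bed_lines is exactly one base wide."""
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        if int(fields[2]) - int(fields[1]) != 1:
            return False
    return True


# Input intervals from the question.
input_lines = [
    "chr1\t18551\t18579",
    "chr1\t18559\t18583",
    "chr1\t18966\t18991",
    "chr1\t18966\t18991",
    "chr1\t18966\t18993",
]

print(expected_line_count(input_lines))  # prints: 129
```

In practice you would read the lines of in.bed and of the expanded file, then compare expected_line_count on the input with the number of lines in the output.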
Pierre Lindenbaum120k wrote:
awk '{S=int($2);E=int($3);while(S<E){printf("%s\t%d\t%d\n",$1,S,S+1);S++}}' in.bed
arup1.4k wrote:
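import sys

# Expand each BED interval (0-based, half-open) into single-base intervals.
with open(sys.argv[1]) as bed:
    for line in bed:
        # split() handles tab- or space-delimited columns and trailing whitespace
        cor = line.split()
        if len(cor) < 3:
            continue  # skip blank or malformed lines
        for i in range(int(cor[1]), int(cor[2])):
            print(cor[0], i, i + 1, sep="\t")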

Save this script as biostars.py and run it with Python 3 (it relies on the print() function): python biostars.py input.bed

venu6.2k wrote:

There are many working solutions, but since you mentioned bedtools, here is how to do it:

cat file.bed | windowMaker -b - -w 1