Question: Expanding Genomic Coordinates Using Bedtools
gtasource wrote, 18 months ago:

I have the following BED file:

chr1    18551   18579   
chr1    18559   18583   
chr1    18966   18991   
chr1    18966   18991   
chr1    18966   18993

I want to expand the coordinates to generate a new file that lists every single coordinate individually. For example:

chr1 18551 18552
chr1 18552 18553
...
chr1 18578 18579

I want to do this for every interval in the BED file and put all of the output in a single file. Any help? I've played around with bedtools slop, but it wasn't doing exactly what I needed.

Alex Reynolds (Seattle, WA USA) wrote, 18 months ago:

You can merge and then chop the intervals from one to N sorted BED files into single-base intervals with the BEDOPS bedops tool (its --merge and --chop operations) and a Unix pipe:

$ bedops --merge A.bed ... N.bed | bedops --chop 1 - > answer.bed

The merge step combines overlapping intervals before chopping, so each covered base is reported only once.

If you instead want duplicate single-base intervals where there are overlaps, just replace the merge operation with a union operation:

$ bedops --everything A.bed ... N.bed | bedops --chop 1 - > answer.bed
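
To make the difference between the two variants concrete, here is a minimal plain-Python sketch of the two intended results on the intervals from the question. It does not call bedops and is not meant to reproduce its implementation or the exact behavior of any particular version; the helper names merge_intervals and chop are made up for this illustration.

```python
# Plain-Python illustration only; it does not call bedops.
# Intervals are BED-style: 0-based start, end-exclusive.
SAMPLE = [
    ("chr1", 18551, 18579),
    ("chr1", 18559, 18583),
    ("chr1", 18966, 18991),
    ("chr1", 18966, 18991),
    ("chr1", 18966, 18993),
]


def merge_intervals(intervals):
    """Merge overlapping or touching intervals, per chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def chop(intervals, size=1):
    """Yield consecutive pieces of at most `size` bases from each interval."""
    for chrom, start, end in intervals:
        for pos in range(start, end, size):
            yield (chrom, pos, min(pos + size, end))


unique_bases = list(chop(merge_intervals(SAMPLE)))
all_bases = list(chop(SAMPLE))
print(len(unique_bases))  # 59: each covered base appears once
print(len(all_bases))     # 129: bases inside overlaps are repeated
```

For the sample data, merging first leaves two regions (18551-18583 and 18966-18993), hence 59 single-base lines, while expanding every input interval separately gives 129 lines.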

Aaronquinlan (United States) wrote, 17 months ago:
bedtools makewindows -b in.bed -w 1
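
For readers who want to see what windowing an interval amounts to, here is a rough Python sketch of the idea, not of the bedtools source. The function name make_windows is hypothetical, and the sketch assumes that the last window is clipped to the end of the interval when the width does not divide it evenly.

```python
def make_windows(chrom, start, end, width=1):
    """Split one BED interval into consecutive windows of `width` bases.

    Assumption: the final window is clipped to the interval end rather
    than extended past it.
    """
    if width < 1:
        raise ValueError("width must be a positive integer")
    return [
        (chrom, pos, min(pos + width, end))
        for pos in range(start, end, width)
    ]


# First interval from the question, one base per window.
single = make_windows("chr1", 18551, 18579, width=1)
print(single[0], single[-1], len(single))

# A wider window that does not divide the interval evenly.
print(make_windows("chr1", 18551, 18579, width=10))
```

With a width of 1, the first interval from the question yields 28 windows, from chr1 18551 18552 up to chr1 18578 18579, which is the output the question asks for.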

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 18 months ago:
awk '{S=int($2); E=int($3); while (S<E) {printf("%s\t%d\t%d\n", $1, S, S+1); S++}}' in.bed

arup (India) wrote, 18 months ago:
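import sys

# Expand each BED interval (0-based start, exclusive end) into single-base intervals.
with open(sys.argv[1]) as bed:
    for line in bed:
        cor = line.split()  # split on any whitespace, so tabs or spaces both work
        if not cor:
            continue  # skip blank lines
        for i in range(int(cor[1]), int(cor[2])):
            print(cor[0], i, i + 1, sep="\t")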

Save this script as biostars.py and run it with Python 3: python biostars.py input.bed
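
Since the expanded file can get large, a quick sanity check may be useful. The following is a minimal sketch, assuming the output was produced without merging (one line per base of every input interval, in input order); parse_bed and check_expansion are hypothetical helper names, not part of any tool mentioned in this thread.

```python
def parse_bed(lines):
    """Parse the first three whitespace-separated columns of BED lines."""
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            yield fields[0], int(fields[1]), int(fields[2])


def check_expansion(original_lines, expanded_lines):
    """Return True if expanded_lines is a per-base expansion of original_lines."""
    original = list(parse_bed(original_lines))
    expanded = list(parse_bed(expanded_lines))
    # Every expanded interval must span exactly one base.
    if any(end - start != 1 for _, start, end in expanded):
        return False
    # Without merging, expect one output line per base of every input interval.
    expected = [
        (chrom, pos, pos + 1)
        for chrom, start, end in original
        for pos in range(start, end)
    ]
    return expanded == expected


original = ["chr1\t5\t8"]
good = ["chr1\t5\t6", "chr1\t6\t7", "chr1\t7\t8"]
print(check_expansion(original, good))       # True
print(check_expansion(original, good[:-1]))  # False: last base is missing
```

To check real files, the two arguments can be open file handles, for example open("input.bed") and open("expanded.bed"), where the second file name is whatever the script's output was redirected to.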

venu (Germany) wrote, 18 months ago:

There are many working solutions, but since you mentioned bedtools, here is how to do it with that:

cat file.bed | windowMaker -b - -w 1