Question: Pulling out interval adjoining regions
rbronste wrote:

I'm looking for a good way to take a set of intervals and write out a new interval set (BED file) containing the regions immediately upstream and downstream of every interval in the original file, let's say 10 kb on each side. Any help appreciated, thanks!

Tags: interval, bedops, bed, bedtools
Damian Kao wrote:

As with most interval operations, bedtools has a command for it:

https://bedtools.readthedocs.io/en/latest/content/tools/flank.html
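
For reference, a minimal sketch of what the call might look like. `bedtools flank` needs a genome file (chromosome name and length, tab-separated) passed with `-g`, and `-b` sets the flank width on both sides. The wrapper function and the file names below are placeholders, not part of the original answer:

```shell
# Hypothetical wrapper: print the 10 kb regions on both sides of each interval.
# $1 = BED file of intervals
# $2 = genome file (chromosome name <TAB> chromosome length)
flank_10kb() {
    bedtools flank -i "$1" -g "$2" -b 10000
}

# Usage (file names are placeholders):
#   flank_10kb intervals.bed genome.txt > flanks.bed
```

Use `-l` and `-r` instead of `-b` for different upstream and downstream widths, and add `-s` if "upstream" should follow the strand column.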


Pierre Lindenbaum replied:

I didn't know that one, thanks!
Pierre Lindenbaum wrote:
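 # Print a 10 kb upstream and a 10 kb downstream flank per interval (upstream start clamped at 0); file names are placeholders.
 awk -F '\t' -v X=10000 '{B=int($2); E=int($3); printf("%s\t%d\t%d\n%s\t%d\t%d\n", $1, (B-X<0 ? 0 : B-X), B, $1, E, E+X);}' input.bed > flanks.bed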
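
The one-liner above clamps the upstream start at 0, but it lets the downstream flank run past the chromosome end and emits an empty interval when an interval starts at 0. Here is a sketch of a variant that also clips to chromosome lengths and skips empty flanks, assuming a two-column, tab-separated sizes file (name, length). The function name and file names are hypothetical:

```shell
# flanks SIZES BED [WIDTH]: print the upstream and downstream flanks of each
# interval, WIDTH bp wide (default 10000), clipped to [0, chromosome length].
flanks() {
    awk -F '\t' -v OFS='\t' -v X="${3:-10000}" '
        NR == FNR { len[$1] = $2 + 0; next }      # first file: chromosome sizes
        {
            B = int($2); E = int($3)
            up = (B - X < 0) ? 0 : B - X          # clamp upstream start at 0
            down = E + X
            if (($1 in len) && down > len[$1])    # clamp downstream end
                down = len[$1]
            if (up < B)   print $1, up, B         # skip zero-length flanks
            if (E < down) print $1, E, down
        }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   flanks genome.txt input.bed | sort -k1,1 -k2,2n > flanks.bed
```

Chromosomes missing from the sizes file are left unclipped on the downstream side. The final sort is only needed if downstream tools expect sorted BED input.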
bernatgel wrote:

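If you are using R, you can do it with the flank function in GenomicRanges: call it once with start=TRUE for the upstream side and once with start=FALSE for the downstream side. If the object carries chromosome lengths, flanks that run past a chromosome end are reported as out-of-bound and can be clipped with trim().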

https://bioconductor.org/packages/3.7/bioc/vignettes/GenomicRanges/inst/doc/GenomicRangesIntroduction.pdf

Alex Reynolds wrote:

Yep, just use BEDOPS bedmap --range to map padded elements:

$ bedmap --skip-unmapped --echo-map --range 10000 reference.bed map.bed | awk '(!a[$0]++)' | sort-bed - > answer.bed

We use awk to strip duplicates from unsorted results. Sorting is necessary because we use --echo-map, where mapped elements can be returned out of order.

The file answer.bed will contain unique elements from map.bed that overlap elements from a 10kb-padded version of reference.bed.

Alex Reynolds wrote:

Here's another approach that uses bedops --range:

$ bedops --merge reference.bed | bedops --range 10000 --everything - | bedops --element-of 1 map.bed - > answer.bed

The file answer.bed will contain unique elements from map.bed that overlap elements from a 10kb-padded version of reference.bed. Adjust padding, as needed.

Merging the reference intervals before padding should handle overlaps, which avoids the need to filter duplicates and re-sort. So this should run faster than using bedmap --range, I think.
