Question

How to keep information from both files when I intersect them?

1

Entering edit mode

7.5 years ago

Sinji ★ 3.2k

I'm using bedops to intersect bed and bedgraph files and want to keep information in both files. Currently bedops only outputs the information on the first file that it is fed.

bedops --element-of $bed $bdg > $overlapped.bed

File A: BED

chr1 1000 2000 region_1
chr1 2000 3000 region_2

File B: BDG

chr1 1000 1500 100
chr1 2000 3000 200

Wanted output:

chr1 1000 2000 region_1 chr1 1000 1500 100
chr1 2000-3000 region_2 chr1 2000 3000 200

Any ideas?

bedops • 4.2k views

ADD COMMENT • link updated 7.5 years ago by ssv.bio ▴ 200 • written 7.5 years ago by Sinji ★ 3.2k

4

Entering edit mode

Sure, use bedmap, instead:

$ bedmap --echo --echo-map --delim '\t' A.bed B.bed > answer.bed

The --echo operator prints out each element in set A.
The --echo-map operator prints any elements in set B ("map set") that overlap set A's element, by one or more bases.
The --delim '\t' operator separates set A elements and any overlapping elements by a tab character.

The bedmap tool maps or reports associations between sets of elements, based on how they overlap.

You can report the associations directly as elements, as shown above with --echo-map, or you can even report attributes or summaries of the associations, like statistical or score-based calculations (mean, median, etc.) using other operators (*).

The documentation shows examples of this for various BED inputs.

(*) Note: For score calculations, Bedgraph needs to be converted to BED5, but that's easy to do:

$ awk '{print $1"\t"$2"\t"$3"\t.\t"$4;}' in.bedgraph > out.bed

If you're just looking at mapped elements, you don't need to convert Bedgraph to BED.

ADD REPLY • link 7.5 years ago by Alex Reynolds 35k

0

Entering edit mode

Answered everything, even anticipated the question about the scoring calculations I was going to have after running it. Thanks Alex!

ADD REPLY • link 7.5 years ago by Sinji ★ 3.2k

0

Entering edit mode

You're welcome! If BEDOPS is useful to you, please consider adding a star to our Github page. This helps us gauge interest and make future updates more easily available.

ADD REPLY • link 7.5 years ago by Alex Reynolds 35k

0

Entering edit mode

anyone know how to do this in perl?...just by chance this is very like something i need.

ADD REPLY • link 7.5 years ago by BioinfGuru ★ 1.7k

1

Entering edit mode

You could use system():

system("bedmap --echo --echo-map --delim '\t' A.bed B.bed > answer.bed") == 0 or die "could not run bedmap for some reason\n";

ADD REPLY • link 7.5 years ago by Alex Reynolds 35k

score 1 · Answer 1 · 2016-10-11

1

Entering edit mode

7.5 years ago

Eric Lim ★ 2.1k

bedtools intersect -b a.bed -a b.bed -wb should do it, though you might need to reorder the columns for your wanted output.

ADD COMMENT • link 7.5 years ago by Eric Lim ★ 2.1k

1

Entering edit mode

The re-ordering is kind of why I wanted to avoid bedtools and stick to bedops. I don't want to be that guy that gets a paper retracted due to incorrect data cleaning.

ADD REPLY • link 7.5 years ago by Sinji ★ 3.2k

score 0 · Answer 2 · 2016-10-11

0

Entering edit mode

7.5 years ago

ssv.bio ▴ 200

Relying on example only, cbind In R should do the trick.

ADD COMMENT • link 7.5 years ago by ssv.bio ▴ 200

0

Entering edit mode

cbind assumes that the columns are in some sort of order, which I can't guarantee.

ADD REPLY • link 7.5 years ago by Sinji ★ 3.2k

0

Entering edit mode

Then, bed tools is simplest, to use for such intersection. Intersect function in genomicranges package in R, also does this kind of intersection.

ADD REPLY • link 7.5 years ago by ssv.bio ▴ 200