Question

How can I merge regions of bigWig files?

0


6.5 years ago

romgrk.cc • 0

Hello,

I am trying to merge specific regions of bigWig files into a single bigWig file. I know about bigWigMerge, which does the merge part, but how can I merge specific regions of bigWig files? Ideally I'd like it to be in a single step, but would it be necessary to do it in many steps? (e.g. extract region of file 1, extract region of file 2, merge extracted regions)

Thank you.

bigWig • 2.6k views

ADD COMMENT • link updated 6.5 years ago by Alex Reynolds 35k • written 6.5 years ago by romgrk.cc • 0

0


Solution 1: Create a new tool from Kent's source to do exactly that, called bigWigMergePlus. It's downloadable here, and the buildable source is here.

Solution 2: slice each bigWig with bigWigToWig, convert the slices back with wigToBigWig, and finally merge them with bigWigMerge. The output is kept as a bedGraph instead of bigWig. Here is a Node.js script to do all that in a single command: gist
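For the slicing step in Solution 2, here is a minimal Python sketch that builds the bigWigToWig call for one region. It relies on the -chrom, -start and -end options of bigWigToWig. The "chr1:1000-2000" region format and the slice_command helper are assumptions made for illustration; they are not taken from the gist.

```python
import re
import shlex

# Region strings such as "chr1:1000-2000" (an assumed, hypothetical input format).
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def slice_command(bigwig_path, region, out_path):
    """Build a bigWigToWig argument list restricted to one region."""
    match = REGION_RE.match(region)
    if match is None:
        raise ValueError(f"cannot parse region: {region!r}")
    start, end = int(match["start"]), int(match["end"])
    if start >= end:
        raise ValueError(f"empty region: {region!r}")
    return [
        "bigWigToWig",
        f"-chrom={match['chrom']}",
        f"-start={start}",
        f"-end={end}",
        bigwig_path,
        out_path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Kent tools.
    print(shlex.join(slice_command("signal.bw", "chr1:1000-2000", "slice.wig")))
```

The returned list can be passed directly to subprocess.check_call() once the Kent tools are on your PATH.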

ADD REPLY • link 5.4 years ago by romgrk.cc • 0

score 2 · Accepted Answer · 2017-10-12

2


6.5 years ago

Alex Reynolds 35k

Via Kent tools, convert from bigWig to WIG:

$ bigWigToWig signal.bw signal.wig

Via BEDOPS, convert signal from WIG to BED and filter for signal that overlaps each region of interest:

$ wig2bed < signal.wig | bedops --element-of 1 - regions_A.bed > signal_over_regions_A.bed
$ wig2bed < signal.wig | bedops --element-of 1 - regions_B.bed > signal_over_regions_B.bed
...
$ wig2bed < signal.wig | bedops --element-of 1 - regions_N.bed > signal_over_regions_N.bed

Get chromosome sizes with Kent tools, e.g. for hg38:

$ fetchChromSizes hg38 > hg38.sizes

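Then convert each subset to bigWig (bedGraphToBigWig expects four-column bedGraph input, so if the conversion complains about the column count, strip the ID column that wig2bed adds, e.g. with cut -f1-3,5):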

$ bedGraphToBigWig signal_over_regions_A.bed hg38.sizes signal_over_regions_A.bw
$ bedGraphToBigWig signal_over_regions_B.bed hg38.sizes signal_over_regions_B.bw
...
$ bedGraphToBigWig signal_over_regions_N.bed hg38.sizes signal_over_regions_N.bw

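Then do your final merge step with bigWigMerge. Note that bigWigMerge writes bedGraph rather than bigWig, so run bedGraphToBigWig on its output once more to get one final bigWig product.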
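To make the merge step concrete: bigWigMerge adds the signal values of its inputs together. The following is a rough pure-Python sketch of that idea on bedGraph-style tuples. It is not bigWigMerge's actual code; the sum_bedgraph name and the in-memory tuple representation are assumptions for illustration only.

```python
from collections import defaultdict


def sum_bedgraph(*tracks):
    """Sum signal across tracks of (chrom, start, end, value) tuples.

    Returns sorted, non-overlapping (chrom, start, end, total) tuples and
    skips stretches that no input interval covers.
    """
    deltas = defaultdict(lambda: defaultdict(float))  # change in summed signal
    depth = defaultdict(lambda: defaultdict(int))     # change in interval count
    for track in tracks:
        for chrom, start, end, value in track:
            deltas[chrom][start] += value
            deltas[chrom][end] -= value
            depth[chrom][start] += 1
            depth[chrom][end] -= 1

    merged = []
    for chrom in sorted(deltas):
        points = sorted(deltas[chrom])
        total, covered = 0.0, 0
        # Sweep left to right over every breakpoint on this chromosome.
        for left, right in zip(points, points[1:]):
            total += deltas[chrom][left]
            covered += depth[chrom][left]
            if covered > 0:
                merged.append((chrom, left, right, total))
    return merged


if __name__ == "__main__":
    track_a = [("chr1", 0, 10, 1.0)]
    track_b = [("chr1", 5, 15, 2.0)]
    for row in sum_bedgraph(track_a, track_b):
        print(*row, sep="\t")
```

Because the running total is built by adding and subtracting floats, long inputs can accumulate small rounding differences; treat this as an illustration of the semantics, not a replacement for the tool.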

This process is repetitive enough to be put into a script, if it looks like a lot of typing.
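As one possible way to script it, here is a hedged Python sketch that generates the command lines for any number of region files. The pipeline_commands helper and the output naming are assumptions, not part of any tool. It also runs wig2bed only once and inserts a cut -f1-3,5 step, on the assumption that wig2bed output has an ID in column 4 and the score in column 5; drop that step if your files are already four-column.

```python
import shlex
from pathlib import Path


def pipeline_commands(bigwig, region_beds, chrom_sizes):
    """Return the shell commands for the extraction steps, one string per step."""
    q = shlex.quote
    wig = str(Path(bigwig).with_suffix(".wig"))
    bed = str(Path(bigwig).with_suffix(".bed"))
    commands = [
        f"bigWigToWig {q(str(bigwig))} {q(wig)}",
        # Convert WIG to BED once and reuse the result for every region set.
        f"wig2bed < {q(wig)} > {q(bed)}",
    ]
    for regions in region_beds:
        stem = f"{Path(bigwig).stem}_over_{Path(regions).stem}"
        # Assumption: columns 1-3 and 5 hold chrom, start, end and score.
        commands.append(
            f"bedops --element-of 1 {q(bed)} {q(str(regions))} "
            f"| cut -f1-3,5 > {q(stem + '.bedGraph')}"
        )
        commands.append(
            f"bedGraphToBigWig {q(stem + '.bedGraph')} {q(str(chrom_sizes))} {q(stem + '.bw')}"
        )
    return commands


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for command in pipeline_commands(
        "signal.bw", ["regions_A.bed", "regions_B.bed"], "hg38.sizes"
    ):
        print(command)
```

Printing the commands first makes it easy to check them by eye; each string could then be run with subprocess.run(command, shell=True, check=True).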

ADD COMMENT • link 6.5 years ago by Alex Reynolds 35k

0


Thanks for the details, very precise!

One more thing: do you think it would be possible to write a program that merges the files directly rather than converting them to intermediary formats? Our use case is a bit different. We're trying to create a web interface where users can select regions of bigWig files and merge them on the fly, so we need the operation to complete as quickly as possible.

Any pointers on how to implement this would be welcome (python libs, specs, etc) :)

ADD REPLY • link 6.5 years ago by romgrk.cc • 0

0


This isn't particularly CPU-intensive work; it's fairly I/O-bound. I'm not sure you're going to get around I/O limitations, but you could probably save some time by running the wig2bed < signal.wig step once (writing its output to signal.bed), and by replacing bedops --element-of 1 with bedextract, which is faster when the signal.bed file is made up of disjoint intervals (which is likely, but something you need to verify).
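To illustrate why sorted, disjoint intervals help, here is a small Python sketch that finds the signal overlapping a region with a binary search instead of a full scan. This is not bedextract's implementation; the SignalIndex class and the (start, end, value) tuples for a single chromosome are assumptions made for the example.

```python
from bisect import bisect_right


class SignalIndex:
    """Lookup over sorted, disjoint, half-open (start, end, value) intervals."""

    def __init__(self, intervals):
        self.intervals = list(intervals)
        # With disjoint sorted intervals the end coordinates are sorted too,
        # so they can be searched with bisect.
        self.ends = [end for _, end, _ in self.intervals]

    def overlapping(self, region_start, region_end):
        """Return the intervals that overlap [region_start, region_end)."""
        # Index of the first interval whose end lies past the region start.
        index = bisect_right(self.ends, region_start)
        hits = []
        while index < len(self.intervals):
            start, end, value = self.intervals[index]
            if start >= region_end:
                break  # everything further right starts after the region
            hits.append((start, end, value))
            index += 1
        return hits


if __name__ == "__main__":
    signal = SignalIndex([(0, 10, 1.0), (10, 20, 2.0), (30, 40, 3.0)])
    print(signal.overlapping(5, 35))
```

The index is built once per signal file, so each additional region costs a binary search plus the number of hits, which matters if a web backend answers many region queries against the same track.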

Writing a web app is probably beyond the scope of this answer and this site. I'd perhaps suggest React for frontend work and Node.js for backend operations, but it's really up to you and what you are familiar with. If you want to use Python, you might look into the subprocess module and its check_call() function: https://pymotw.com/2/subprocess/
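As a minimal sketch of the subprocess route: check_call() runs a command and raises CalledProcessError if it exits with a non-zero status, which is convenient for stopping a multi-step pipeline at the first failure. The run_step wrapper is a hypothetical helper, and the demo uses the Python interpreter as a stand-in command so that it runs without the Kent tools installed.

```python
import subprocess
import sys


def run_step(args):
    """Run one external command; raises CalledProcessError on failure."""
    print("running:", " ".join(args), file=sys.stderr)
    subprocess.check_call(args)


if __name__ == "__main__":
    # Stand-in command; in practice this might be
    # ["bigWigToWig", "signal.bw", "signal.wig"].
    run_step([sys.executable, "-c", "print('step finished')"])
    try:
        run_step([sys.executable, "-c", "import sys; sys.exit(3)"])
    except subprocess.CalledProcessError as error:
        print("step failed with exit code", error.returncode)
```

Passing the command as a list avoids shell quoting problems with file names; steps that need pipes or redirection can either use shell=True or chain processes through subprocess.Popen.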

ADD REPLY • link 6.5 years ago by Alex Reynolds 35k