Question: smoothing or binning bigWig file
3.7 years ago by
United States
igor11k wrote:

If I have a bigWig file, is there a simple way to group values by bin or perform smoothing?

For example, deepTools bamCoverage can create bigWigs with a specific bin size and smoothing length. It's a great one-line solution, but it needs BAM as an input.

The best option I can think of is converting bigWig to bedGraph. Then it becomes a tab-delimited table and you can write a script to modify it as you wish, but that seems somewhat convoluted. Seems dumb to have a custom script if there is already an established way of doing it.
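For reference, the "custom script" route can be fairly short. Below is a minimal sketch, not an established tool: `bin_bedgraph` is a made-up helper name, the input is assumed to be a four-column bedGraph sorted by chromosome and start, and each bin gets the mean over the bases that actually carry signal.

```shell
#!/bin/sh
# Sketch: average a bedGraph (chrom, start, end, value) over fixed-width bins.
# bin_bedgraph is a hypothetical helper; usage: bin_bedgraph FILE [BIN_SIZE]
# Assumes the input is sorted by chromosome, then by start position.
bin_bedgraph() {
    awk -v bin="${2:-1000}" 'BEGIN { FS = OFS = "\t" }
    function emit() {
        # Print the finished bin with its coverage-weighted mean.
        if (covered > 0) print chrom, bstart, bstart + bin, total / covered
        total = 0; covered = 0
    }
    {
        s = $2
        while (s < $3) {
            b = int(s / bin) * bin               # start of the bin containing s
            if ($1 != chrom || b != bstart) { emit(); chrom = $1; bstart = b }
            e = ($3 < b + bin) ? $3 : b + bin    # clip the interval to this bin
            total += $4 * (e - s); covered += e - s
            s = e
        }
    }
    END { emit() }' "$1"
}
```

Something like `bin_bedgraph input.bedgraph 150 > binned.bedgraph` would then give 150 bp bins. Note that the last bin on a chromosome can extend past the chromosome end, so it may need clipping before converting back to bigWig.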

bigwig • 4.5k views
modified 18 months ago by keller.mckowen10 • written 3.7 years ago by igor11k

Hey, thanks to all for this post. iamjli: I was inspired by your advice to try this with deepTools, but I ended up using bigwigCompare rather than multiBigWigSummary, because bigwigCompare can write bigWig output, while multiBigWigSummary writes an npz or tab-delimited file. I used the "mean" operation with the same bigWig file as both -b1 and -b2, then used the --binSize option to do the smoothing. I was using ChIP-chip data with a value every 50 bp, so I used a bin size of 150 bp.
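A rough sketch of that bigwigCompare approach, assuming deepTools is installed. The helper name `smooth_bigwig`, the file names, and the `DRY_RUN` switch are made up for illustration; the bigwigCompare options are the ones described above plus `-o` for the output file.

```shell
#!/bin/sh
# Sketch: bin a bigWig by taking the "mean" of the file with itself.
# smooth_bigwig is a hypothetical helper; usage: smooth_bigwig IN.bw OUT.bw [BIN_SIZE]
# Set DRY_RUN=1 to print the command instead of running it.
smooth_bigwig() {
    in_bw=$1
    out_bw=$2
    bin_size=${3:-150}
    ${DRY_RUN:+echo} bigwigCompare \
        -b1 "$in_bw" -b2 "$in_bw" \
        --operation mean \
        --binSize "$bin_size" \
        -o "$out_bw"
}
```

For example, `smooth_bigwig chip.bw chip.150bp.bw 150` should write a bigWig averaged over 150 bp bins.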

written 18 months ago by keller.mckowen10
3.7 years ago by
Alex Reynolds30k
Seattle, WA USA
Alex Reynolds30k wrote:

Via a couple Kent tools and BEDOPS, convert the bigWig file to sorted BED:

$ bigWigToBedGraph input.bw input.bedgraph
$ awk '{ print $1"\t"$2"\t"$3"\tid-"NR"\t"$4; }' input.bedgraph | sort-bed - > input.bed

Get the chromosomal bounds for your genome build of interest, e.g. hg19, and convert them into sorted BED:

$ fetchChromSizes hg19 > hg19.bounds.unsorted.txt
$ awk '{ print $1"\t0\t"$2; }' hg19.bounds.unsorted.txt | sort-bed - > hg19.bounds.bed

Once you prepare your inputs as BED files, you can use a one-liner to measure signal in sliding windows or bins.

For example, use bedops --chop and --stagger to split up the bounds into, say, 1000-base windows staggered every 100 bases: basically a sliding window 1000 bases wide that is positioned at every 100 bases. Pipe this to bedmap to map the windows against the signal from the converted bigWig file (input.bed), taking the mean signal over each window:

$ bedops --chop 1000 --stagger 100 hg19.bounds.bed | bedmap --faster --echo --mean --delim "\t" --skip-unmapped - input.bed > answer.bed

You can put in whatever values you want for --chop and --stagger to decide how finely- or coarsely-grained you want to smooth the signal. For instance, a --stagger value of 0 (or leaving out this option) would change the analysis from a sliding window to measuring signal over disjoint bins.

You can use other measurements than --mean. See bedmap --help or take a look at the documentation for a description of all the signal- or score-based operands.
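The disjoint-bin case also produces output that can go back to bigWig, since bedGraphToBigWig wants non-overlapping four-column intervals. A rough sketch, assuming BEDOPS and the Kent tools are on the PATH and reusing the file names from the commands above; `bins_to_bigwig` and `binned.bedgraph` are made-up names:

```shell
#!/bin/sh
# Sketch: mean signal over disjoint bins, written back out as bigWig.
# bins_to_bigwig is a hypothetical helper; usage: bins_to_bigwig [BIN_SIZE] [OUT.bw]
# Expects hg19.bounds.bed, hg19.bounds.unsorted.txt and input.bed from the steps above.
bins_to_bigwig() {
    bin_size=${1:-1000}
    out_bw=${2:-binned.bw}
    # No --stagger here, so the bins do not overlap.
    bedops --chop "$bin_size" hg19.bounds.bed \
        | bedmap --faster --echo --mean --delim "\t" --skip-unmapped - input.bed \
        > binned.bedgraph \
    && bedGraphToBigWig binned.bedgraph hg19.bounds.unsorted.txt "$out_bw"
}
```

The sliding-window output (with --stagger) has overlapping intervals, so it would not convert this way without further processing.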

modified 3.7 years ago • written 3.7 years ago by Alex Reynolds30k
18 months ago by
iamjli10 wrote:

A little late, but what about multiBigWigSummary? I don't see why it wouldn't work with just one input file.

written 18 months ago by iamjli10