Using bioconda, you can install everything you need for this example with:
conda install --channel bioconda pybedtools bedtools htslib matplotlib
Run get-data.sh to download data from ENCODE:
bash get-data.sh
Input files have the following format (UCSC broadPeak and narrowPeak formats, which are variants of BED format):
chr1 569797 570055 . 1000 . 38.118451 16.0 -1
chr1 724125 2647713 . 258 . 1.259053 11.2 -1
chr1 752542 752779 . 658 . 10.178273 1.9 -1
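To make the column layout concrete, here is a minimal sketch that parses one of the lines above. It assumes the nine-column UCSC broadPeak layout (chrom, start, end, name, score, strand, signalValue, pValue, qValue); the `BroadPeak` type and `parse_broadpeak` helper are names made up for illustration and are not part of the scripts in this example.

```python
from typing import NamedTuple


class BroadPeak(NamedTuple):
    """One record in UCSC broadPeak layout (hypothetical helper type)."""
    chrom: str
    start: int           # 0-based start, as in BED
    end: int             # end position, exclusive
    name: str            # "." when unused
    score: int           # 0-1000
    strand: str          # "+", "-", or "." when unused
    signal_value: float
    p_value: float       # -log10 p-value; -1 means not assigned
    q_value: float       # -log10 q-value; -1 means not assigned


def parse_broadpeak(line: str) -> BroadPeak:
    """Split a whitespace-delimited broadPeak line into typed fields."""
    f = line.split()
    if len(f) < 9:
        raise ValueError(f"expected at least 9 columns, got {len(f)}")
    return BroadPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]),
                     f[5], float(f[6]), float(f[7]), float(f[8]))


peak = parse_broadpeak("chr1 569797 570055 . 1000 . 38.118451 16.0 -1")
print(peak.chrom, peak.end - peak.start, peak.score)
```

A narrowPeak line carries one extra trailing column (the peak summit offset), which this sketch simply ignores.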
Then run binary_heatmaps.py to generate the plot, a summary file, and a directory of interval files for each class:
python binary_heatmaps.py
In the plot, rows are genomic intervals (as output by bedtools multiinter); columns are input BED files; a black cell indicates that the factor was found in that genomic interval.
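The idea behind the heatmap can be sketched in plain Python as a 0/1 matrix with one row per interval and one column per factor. This is only an illustration of the data structure, not the code in binary_heatmaps.py; the `binary_matrix` function and the toy coordinates are hypothetical.

```python
def binary_matrix(rows, factors):
    """Return a list of 0/1 rows, one per interval.

    rows    -- list of (chrom, start, end, present), where `present` is the
               set of factor names found in that interval
    factors -- column order for the matrix
    """
    return [[1 if f in present else 0 for f in factors]
            for (_chrom, _start, _end, present) in rows]


factors = ["GATA1", "LSD1", "TAL1"]

# Toy intervals with made-up coordinates, for illustration only.
rows = [
    ("chr1", 100, 200, {"LSD1"}),
    ("chr1", 200, 350, {"LSD1", "TAL1"}),
    ("chr1", 900, 950, {"GATA1", "LSD1", "TAL1"}),
]

matrix = binary_matrix(rows, factors)
for (chrom, start, end, _present), bits in zip(rows, matrix):
    print(chrom, start, end, *bits)
```

A matrix like this can be handed to matplotlib's `imshow` with a binary colormap to get a black-and-white plot of the same kind.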
Summary of how many genomic intervals fall into each combinatorial class:
LSD1: 16181
LSD1,TAL1: 15120
TAL1: 7989
GATA1,LSD1,TAL1: 3009
GATA1,LSD1: 654
GATA1: 231
GATA1,TAL1: 214
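A summary of this shape can be produced by labelling each interval with the sorted, comma-joined names of the factors present and counting the labels. The sketch below assumes that labelling scheme (it matches the class names shown above); `class_label`, `summarize`, and the toy input are hypothetical.

```python
from collections import Counter


def class_label(present):
    """Sorted, comma-joined factor names, e.g. {'TAL1', 'LSD1'} -> 'LSD1,TAL1'."""
    return ",".join(sorted(present))


def summarize(presence_sets):
    """Count how many intervals fall into each combinatorial class."""
    return Counter(class_label(p) for p in presence_sets)


# Toy input: one set of factor names per interval (illustrative only).
toy = [{"LSD1"}, {"TAL1", "LSD1"}, {"LSD1"}, {"GATA1", "TAL1", "LSD1"}]

summary = summarize(toy)
for label, n in summary.most_common():
    print(f"{label}: {n}")
```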
For each of the above classes, a BED file of the indicated intervals is also written. For example:
track name="LSD1_and_TAL1"
chr1 778211 778487
chr1 854053 854329
chr1 948500 948776
...
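As a rough sketch of how such a per-class file could be assembled, the function below builds the text for one class, assuming the track name is the class label with commas replaced by "_and_" (as in the example above). `class_to_bed` is a hypothetical helper, not a function from the scripts.

```python
def class_to_bed(label, intervals):
    """Build BED text for one class: a track line plus one line per interval."""
    track_name = label.replace(",", "_and_")
    lines = [f'track name="{track_name}"']
    lines += [f"{chrom}\t{start}\t{end}" for chrom, start, end in intervals]
    return "\n".join(lines) + "\n"


bed_text = class_to_bed(
    "LSD1,TAL1",
    [("chr1", 778211, 778487), ("chr1", 854053, 854329)],
)
print(bed_text, end="")
```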
Sinji;
Thank you for the reply. I have tried both programs, and I can say that if you are comparing a lot of BED files, HOMER is much better. Its output includes a variety of information, from unique peaks that occur in a single sample to peaks that occur in all the samples.
Give it a try.
Best,
Tunc.