Question

.mcool to .hic

4

5.3 years ago

dimitrischat ▴ 210

Hi. I am trying to work out a pipeline for Hi-C data analysis: https://hms-dbmi.github.io/hic-data-analysis-bootcamp/#1

However, I would prefer to use Juicebox for visualization. I am at the point where I have an .mcool file (cooler) and I need a .hic file so I can load it in Juicebox. Does anyone know how to convert it?

I tried using the HiGlass web-based app, but it needs .json files and I have no idea how to proceed.

RNA-Seq • 13k views

ADD COMMENT • link updated 8 months ago by Ben B ▴ 50 • written 5.3 years ago by dimitrischat ▴ 210

1

I am one of the main developers of HiGlass, and I agree that our setup could and should be easier. Where specifically do you get stuck? Also, feel free to join our slack channel (http://bit.ly/higlass-slack) if you need quick help.

ADD REPLY • link 5.2 years ago by F Lekschas ▴ 10

0

maybe you can try pairLiftOver https://github.com/XiaoTaoWang/pairLiftOver

ADD REPLY • link 2.1 years ago by luzhang • 0

score 6 · Answer 1 · 2021-02-24

I've put together a slightly hacky and long-winded way to go from .cool to .hic.

Firstly, use the hicConvertFormat tool from the HiCExplorer package (which I installed using conda) to convert your .cool file into a ginteractions file:

hicConvertFormat -m /path/to/file.cool -o /path/to/file.ginteractions --inputFormat cool --outputFormat ginteractions

This will create the file file.ginteractions.tsv. Next, we will do some format preparation on this file to make it compatible with juicer pre. Documentation on the input formats accepted can be found here: https://github.com/aidenlab/juicer/wiki/Pre . I opted to make the file into short format with score, which has columns like:

<str1> <chr1> <pos1> <frag1> <str2> <chr2> <pos2> <frag2> <score>

The ginteractions file does not contain fragment or strand information, so I filled those columns with dummy values (they are not used for the conversion to .hic anyway), making sure that the dummy values for frag1 and frag2 differ, using the following awk command:

awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' file.ginteractions.tsv > file.ginteractions.tsv.short
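
To make the column mapping concrete, here is a minimal sketch that applies the same awk mapping to two made-up rows. The function name to_short_with_score and the sample rows are illustrative only, and it assumes the seven tab-separated columns that the awk command above implies (chrom1, start1, end1, chrom2, start2, end2, count). The NF >= 7 guard is an addition that skips short or malformed lines.

```shell
# Hypothetical helper: same column mapping as the awk command above,
# reading ginteractions-style rows on stdin.
# The "NF >= 7" guard is an addition that skips lines with too few columns.
to_short_with_score() {
    awk -F "\t" 'NF >= 7 {print 0, $1, $2, 0, 0, $4, $5, 1, $7}'
}

# Two made-up rows: chrom1 start1 end1 chrom2 start2 end2 count
printf 'chr1\t0\t10000\tchr1\t20000\t30000\t5\nchr1\t10000\t20000\tchr2\t0\t10000\t2\n' \
    | to_short_with_score
# prints:
# 0 chr1 0 0 0 chr1 20000 1 5
# 0 chr1 10000 0 0 chr2 0 1 2
```

The output is space-separated because awk's default output field separator is a single space.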

This file may need to be sorted, because juicer pre expects a specific chromosome ordering. To sort it, run:

sort -k2,2d -k6,6d file.ginteractions.tsv.short > file.ginteractions.tsv.short.sorted
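
If you want to know whether the sort is needed at all, sort's -c (check) option can test a file against the same keys without rewriting it. The sketch below wraps that in a hypothetical helper, is_sorted_for_pre. It only mirrors the sort keys used above and makes no claim about Juicer's full input requirements.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file is already in the
# order produced by `sort -k2,2d -k6,6d`, and fails otherwise.
# It only mirrors the sort keys above; it does not validate the file for Juicer.
is_sorted_for_pre() {
    sort -c -k2,2d -k6,6d "$1" 2>/dev/null
}

# Usage (file name as in the answer):
# is_sorted_for_pre file.ginteractions.tsv.short || echo "needs sorting"
```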

I downloaded juicer tools from here: https://github.com/aidenlab/juicer/wiki/Download and set the following alias. Note that you may need to increase the memory allocated to the JVM (the -Xmx value) for very large files:

alias juicer='java -Xms512m -Xmx2048m -jar path/to/juicer_tools_1.22.01.jar'

Converting the short-format-with-score file is then done with the following:

juicer pre -r 10000,20000,50000,100000,250000,500000,1000000 /path/to/file.ginteractions.tsv.short.sorted /path/to/file.ginteractions.tsv.short.sorted.hic /path/to/chrom.sizes

Where the chrom.sizes file contains two columns: <chrom> <chrom size>. The -r flag here specifies the resolutions you would like your .hic file to include.
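
If you do not already have a chrom.sizes file, one common way to get the two columns is from a FASTA index. This is a sketch that assumes you have a .fai file (as written by samtools faidx) for the same assembly your matrix was built on; the helper name fai_to_chrom_sizes and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper: read a FASTA index (.fai) on stdin and keep the first
# two tab-separated columns, which are the sequence name and its length.
fai_to_chrom_sizes() {
    cut -f 1,2
}

# Usage (file names are placeholders):
# fai_to_chrom_sizes < genome.fa.fai > chrom.sizes
```

Check that the chromosome names match those in your interactions file (for example chr1 versus 1).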

score 5 · Answer 2 · 2019-01-24

You can load your mcool files in your own HiGlass server that you can launch on your local machine using higlass-manage as explained in their documentation:

http://higlass.io/docs

https://github.com/higlass/higlass-manage

I agree that it is not easy to set up, but it is worth the trouble.

As for .mcool to .hic: hic2cool supports only the opposite direction (.hic to .cool). I don't think there is an official way of converting .cool or .mcool to .hic right now, which is very unfortunate. So, sadly, you have to start from scratch with Juicer if you want to use their visualisation tools. Maybe you can work something out using your mapped reads and their Pre tool:

https://github.com/aidenlab/juicer/wiki/Pre#file-format

It seems people are currently switching to HiGlass to visualize Hi-C data anyway; the setup is more complicated than for Juicebox, but the outcome seems better in certain ways.

As an alternative, you can use HiCBrowser to visualise cool files directly. You can see an example here:

http://chorogenome.ie-freiburg.mpg.de/

For further visualisation and data integration to make figures ready to publish you can use pyGenomeTracks (although the views are static):

https://github.com/deeptools/pyGenomeTracks

For long-range interactions, pyGenomeTracks and HiCBrowser are not the best choice; HiGlass and Juicer are better for that. If you don't mind static views, you can also use hicPlotMatrix from HiCExplorer, which is similar to cooler show:

https://hicexplorer.readthedocs.io/en/latest/content/tools/hicPlotMatrix.html#hicplotmatrix

score 5 · Answer 3 · 2023-08-23

I know this is an older post by now but I recently had to convert .mcool back to .hic and found it to be kind of a pain, so I wanted to provide my script in case it can be helpful to anyone else here. This extends Charlotte's answer to .mcools, which need to be handled a bit differently from cools.

Instead of HiCExplorer, I'm using Cooler to extract the interactions from the .mcool at the highest resolution available, then converting those interactions to .hic with JuicerTools.

#!/bin/bash

# Set the path to the input .mcool file
input_mcool=$1

# Set the path to the output .hic file
output_hic=${input_mcool%.*}.hic

# Set the path to the chrom.sizes file
chrom_sizes=~/ref_annots/hg19.chrom.sizes

# Set the path to the juicer_tools jar file
juicer_tools_jar=juicer_tools_1.22.01.jar


# Get the resolutions (bin sizes) stored in the .mcool file, sorted ascending
resolutions=$(h5ls -r "$input_mcool" | grep -Eo 'resolutions/[0-9]+' | cut -d '/' -f 2 | sort -n | uniq)
echo $resolutions
# The smallest bin size is the highest resolution
highest_res=$(echo $resolutions | tr ' ' '\n' | head -n 1)
echo "highest resolution: $highest_res"

# Use Cooler to write the .mcool matrix as interactions in bedpe format
# (strip the .mcool suffix so the output can never overwrite the input)
output_bedpe="${input_mcool%.mcool}.${highest_res}.bedpe"
echo "cooler dump --join $input_mcool::/resolutions/$highest_res"
cooler dump --join "$input_mcool::/resolutions/$highest_res" > "$output_bedpe"

# Convert the bedpe file to short format with score using awk
awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' "${output_bedpe}" > "${output_bedpe}.short"

# Sort the short format with score file
sort -k2,2d -k6,6d "${output_bedpe}.short" > "${output_bedpe}.short.sorted"

# Convert the short format with score file to .hic using juicer pre
# (resolutions in -r finer than $highest_res cannot add detail beyond what the .mcool stores)
java -Xms512m -Xmx2048m -jar "$juicer_tools_jar" pre -r 1000,2000,5000,10000,20000,50000,100000,250000,500000,1000000 "${output_bedpe}.short.sorted" "$output_hic" "$chrom_sizes"
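
The -r list in the script is fixed, while the base bin size comes from the .mcool. As a sketch (not part of the original script), the two hypothetical helpers below show how the base bin size is picked from h5ls-style output and how a requested list could be trimmed to whole multiples of it. This rests on the assumption that bins finer than, or not aligned with, the dumped bin size would not be meaningful, since every dumped position is a bin start.

```shell
# Hypothetical helper: pick the finest (numerically smallest) bin size from
# h5ls-style paths such as "/resolutions/5000 Group", read on stdin.
finest_resolution() {
    grep -Eo 'resolutions/[0-9]+' | cut -d '/' -f 2 | sort -n | uniq | head -n 1
}

# Hypothetical helper: keep only the requested resolutions that are whole
# multiples of the base bin size, joined with commas for `pre -r`.
# Usage: usable_resolutions BASE RES1 RES2 ...
usable_resolutions() {
    local base=$1
    shift
    printf '%s\n' "$@" | awk -v base="$base" '$1 % base == 0' | paste -sd, -
}

# Example with made-up h5ls output and a made-up resolution list
base=$(printf '/resolutions/10000 Group\n/resolutions/5000 Group\n' | finest_resolution)
usable_resolutions "$base" 1000 2000 5000 10000 25000
# prints: 5000,10000,25000
```

The comma-separated result could then be passed to pre -r in place of the hard-coded list.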

score 2 · Answer 4 · 2022-07-04

21 months ago

zsq.phy ▴ 20

I wrote a simple script, cool2hic.py, for converting a cooler file to a .hic file.

https://github.com/zsq-berry/3D-genome-tools

ADD COMMENT • link 20 months ago by zsq.phy ▴ 20