.mcool to .hic
2.4 years ago
dimitrischat ▴ 140

Hi. I am trying to work out a pipeline for Hi-C data analysis: https://hms-dbmi.github.io/hic-data-analysis-bootcamp/#1

I would prefer to use Juicebox for visualization, though. I am at the point where I have an .mcool file (cooler) and I need a .hic file so I can load it in Juicebox. Does anyone know how to convert it?

I tried using the HiGlass web-based app, but it needs .json files, and I have no idea how to proceed.

RNA-Seq • 3.0k views

I am one of the main developers of HiGlass, and I agree that our setup could and should be easier. Where specifically do you get stuck? Also, feel free to join our Slack channel (http://bit.ly/higlass-slack) if you need quick help.

2.4 years ago

You can load your mcool files in your own HiGlass server that you can launch on your local machine using higlass-manage as explained in their documentation:

http://higlass.io/docs

https://github.com/higlass/higlass-manage

I agree that it is not easy to set up, but it is worth the trouble.

Regarding .mcool to .hic: only the opposite direction (.hic to .cool) is supported, by hic2cool. I don't think there is an official way of converting .cool or .mcool to .hic right now, which is very unfortunate. So, sadly, you have to start from scratch with Juicer if you want to use their visualisation tools. One option might be to feed your mapped reads to their Pre tool:

https://github.com/aidenlab/juicer/wiki/Pre#file-format

People seem to be switching to HiGlass for visualizing Hi-C data anyway; the setup is more complicated than Juicebox, but the outcome seems better in certain ways.

As an alternative, you can use HiCBrowser to visualise cool files directly. See an example here:

http://chorogenome.ie-freiburg.mpg.de/

For further visualisation and data integration to make publication-ready figures, you can use pyGenomeTracks (although the views are static):

https://github.com/deeptools/pyGenomeTracks

For long-range interactions, pyGenomeTracks and HiCBrowser are not the best choice; HiGlass and Juicer are better for that. If you don't mind static views, you can also use hicPlotMatrix from HiCExplorer, which is similar to cooler show.

3 months ago

I've put together a slightly hacky and long-winded way to go from .cool to .hic.

Firstly, use the hicConvertFormat tool from the HiCExplorer package (which I installed using conda) to convert your .cool file into a ginteractions file:

hicConvertFormat -m /path/to/file.cool -o /path/to/file.ginteractions --inputFormat cool --outputFormat ginteractions

This will create the file file.ginteractions.tsv. Next, we need to reformat this file to make it compatible with Juicer pre. Documentation on the accepted input formats can be found here: https://github.com/aidenlab/juicer/wiki/Pre . I opted to convert the file to the short format with score, which has these columns:

<str1> <chr1> <pos1> <frag1> <str2> <chr2> <pos2> <frag2> <score>

The ginteractions file does not contain fragment or strand information, so I put in dummy values for those (since they are not used for the conversion to .hic anyway) and made sure that the dummy values for frag1 and frag2 were different, using the following awk command:

awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' file.ginteractions.tsv > file.ginteractions.tsv.short
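To make the column mapping concrete, here is a minimal sketch that runs the same reshaping on a two-line toy file. The file names and all values are made up for illustration, and it assumes the ginteractions columns are chr1, start1, end1, chr2, start2, end2, count, which is what the awk command above relies on.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Toy input (made-up values): chr1 start1 end1 chr2 start2 end2 count
printf 'chr1\t0\t10000\tchr1\t20000\t30000\t5\n'  > "$workdir/toy.ginteractions.tsv"
printf 'chr1\t0\t10000\tchr2\t40000\t50000\t2\n' >> "$workdir/toy.ginteractions.tsv"

# Reshape to: str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 score
# Strands are dummy 0s; frag1=0 and frag2=1 are dummy values that differ.
awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' \
    "$workdir/toy.ginteractions.tsv" > "$workdir/toy.short"

cat "$workdir/toy.short"
```

Each output line should have nine space-separated fields, with the bin start positions used as pos1 and pos2 and the count used as the score.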

Sometimes this file needs to be sorted first, because Juicer requires a specific chromosome ordering. To sort it, run:

sort -k2,2d -k6,6d file.ginteractions.tsv.short > file.ginteractions.tsv.short.sorted
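As a rough illustration of what the sort achieves, the sketch below applies the same sort keys to three made-up records and prints the order of chromosome-pair blocks before and after. The toy file and the variable names are hypothetical; the point is only that records for the same chr1/chr2 pair end up next to each other.

```shell
sortdir=$(mktemp -d)

# Made-up short-format records; the 1_2 pair is split by a 1_1 record.
cat > "$sortdir/toy.short" <<'EOF'
0 1 100 0 0 2 500 1 3
0 1 200 0 0 1 300 1 7
0 1 900 0 0 2 100 1 1
EOF

# Same sort keys as above: chr1 (column 2), then chr2 (column 6).
sort -k2,2d -k6,6d "$sortdir/toy.short" > "$sortdir/toy.short.sorted"

# Collapse consecutive records into blocks of chr1_chr2 and list them in order.
blocks_before=$(awk '{print $2 "_" $6}' "$sortdir/toy.short" | uniq | tr '\n' ' ')
blocks_after=$(awk '{print $2 "_" $6}' "$sortdir/toy.short.sorted" | uniq | tr '\n' ' ')
echo "before: $blocks_before"
echo "after:  $blocks_after"
```

Before sorting, the 1_2 block shows up twice; after sorting, each pair appears as a single block.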

I downloaded Juicer Tools from https://github.com/aidenlab/juicer/wiki/Download and set the following alias. You may need to increase the memory allocated to the JVM for very large files:

alias juicer='java -Xms512m -Xmx2048m -jar path/to/juicer_tools_1.22.01.jar'

Converting the short-format-with-score file is then done as follows:

juicer pre -r 10000,20000,50000,100000,250000,500000,1000000 /path/to/file.ginteractions.tsv.short.sorted /path/to/file.ginteractions.tsv.short.sorted.hic /path/to/chrom.sizes

Where the chrom.sizes file contains two columns: <chrom> <chrom size>. The -r flag here specifies the resolutions you would like your .hic file to include.
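One thing that may be worth checking before running pre is that every chromosome name in the contact file also appears in chrom.sizes (for example `chr1` versus `1`). The sketch below is only an illustration: `missing_chroms` is a hypothetical helper, not part of Juicer, and the toy files and sizes are made up.

```shell
# Hypothetical helper: list chromosome names used in a short-format file
# (columns 2 and 6) that are absent from the first column of chrom.sizes.
# Usage: missing_chroms chrom.sizes contacts.short
missing_chroms() {
    awk 'NR == FNR { known[$1] = 1; next }
         !($2 in known) { missing[$2] = 1 }
         !($6 in known) { missing[$6] = 1 }
         END { for (c in missing) print c }' "$1" "$2" | sort
}

sizedir=$(mktemp -d)

# Made-up chrom.sizes with two columns: <chrom> <chrom size>
printf 'chr1\t1000000\nchr2\t800000\n' > "$sizedir/chrom.sizes"

# Made-up contacts; chr3 is deliberately not listed in chrom.sizes.
cat > "$sizedir/toy.short" <<'EOF'
0 chr1 0 0 0 chr1 20000 1 5
0 chr1 0 0 0 chr3 40000 1 2
EOF

missing=$(missing_chroms "$sizedir/chrom.sizes" "$sizedir/toy.short")
echo "not in chrom.sizes: ${missing:-none}"
```

An empty result means the two files agree on chromosome names.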


Hi, it reports the error: """ ..Error: the chromosome combination 1_2 appears in multiple blocks """


Hi, yes, sorry: sometimes the file needs sorting first, as seen here: https://groups.google.com/g/3d-genomics/c/2w1OGHo5XdM . I will adjust my answer accordingly.
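The error message suggests that records for one chromosome pair appear in more than one place in the file. As a rough way to check for this before running pre, the sketch below reports any pair whose records are not contiguous. `find_split_pairs` is a hypothetical helper written for this example, not a Juicer tool, and the toy records are made up.

```shell
# Hypothetical helper: print each chromosome pair (chr1_chr2) whose records
# are NOT contiguous in a short-format-with-score file.
find_split_pairs() {
    awk '{
        pair = $2 "_" $6
        if (pair != prev) {
            # A pair that starts a new block after its earlier block ended is split.
            if ((pair in closed) && !(pair in reported)) {
                print pair
                reported[pair] = 1
            }
            if (NR > 1) closed[prev] = 1
            prev = pair
        }
    }' "$1"
}

chkdir=$(mktemp -d)

# Made-up records in which the 1_2 pair is interrupted by a 1_1 record.
cat > "$chkdir/unsorted.short" <<'EOF'
0 1 100 0 0 2 500 1 3
0 1 200 0 0 1 300 1 7
0 1 900 0 0 2 100 1 1
EOF
sort -k2,2d -k6,6d "$chkdir/unsorted.short" > "$chkdir/sorted.short"

split_unsorted=$(find_split_pairs "$chkdir/unsorted.short")
split_sorted=$(find_split_pairs "$chkdir/sorted.short")
echo "split pairs before sorting: ${split_unsorted:-none}"
echo "split pairs after sorting: ${split_sorted:-none}"
```

If the helper prints nothing for the sorted file, every chromosome pair forms a single block.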