Tutorial: Visualizing ChIP-Seq Data Using UCSC [bigWig]
10.9 years ago

Hola!

I would like to write a quick tutorial on viewing your coverage/track/wiggle files in the UCSC Genome Browser. I am assuming you are comfortable with command-line tools and Linux, and that you know how to analyse ChIP-Seq data.

Once you have called peaks or have coverage files (bedGraph) from ChIP-Seq or RNA-Seq data, the next step is to visualize the data in UCSC or IGV. To do that, the coverage file should be converted to wig (wiggle) or, better, bigWig format. You can also go to UCSC, add a custom track using this URL, and upload your bedGraph/coverage file, which is in a BED-like format but represents your peaks. You can upload as many files as you want and then view them (limit per file ~1000 MB).

You can also make .wig files and upload them the same way. These are session tracks: if you reset your UCSC session they are removed, and they can't be shared (unless you use the same computer).

We will talk about a better way: converting the coverage or wig files to bigWig files that are kept on your local web server. UCSC then fetches data from the file only for the range the user is currently viewing. This makes it faster, removes the need to upload files, makes sharing and keeping track of tracks easy, and lets you view big files (>1 GB).

First step : Installing

Grab the tools from the UCSC FTP site for your OS. Use bedGraphToBigWig to convert a coverage (bedGraph) file to bigWig, or wigToBigWig for a wig file. Tools for back-conversion, such as bigWigToWig, are there too if you need them later. Also fetch the fetchChromSizes utility to get the chromosome sizes for your organism of interest.
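The install step could be scripted roughly as below. This is only a sketch: the base URL and the platform directory name (linux.x86_64) are my assumptions about the UCSC download server layout, and ucsc_tool_url / install_ucsc_tools are hypothetical helper names, so check the server for the directory that matches your OS.

```shell
# Assumed location of the prebuilt UCSC utilities; verify before relying on it.
UCSC_EXE_BASE="http://hgdownload.soe.ucsc.edu/admin/exe"

# Print the download URL of one utility for a given platform directory.
ucsc_tool_url() {
    platform="$1"   # e.g. linux.x86_64
    tool="$2"       # e.g. bedGraphToBigWig
    printf '%s/%s/%s\n' "$UCSC_EXE_BASE" "$platform" "$tool"
}

# Download the converters and fetchChromSizes, then make them executable.
install_ucsc_tools() {
    platform="$1"
    for tool in bedGraphToBigWig wigToBigWig fetchChromSizes; do
        curl -fsSL -o "$tool" "$(ucsc_tool_url "$platform" "$tool")" || return 1
        chmod +x "$tool"
    done
}

# Example (needs network access, so it is left commented out):
# install_ucsc_tools linux.x86_64
# ./fetchChromSizes mm9 > mm9.chromSizes
```

Nothing is downloaded until you call install_ucsc_tools yourself; the last commented line writes the chromosome sizes file used in the conversion step.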

Second step : Conversion

Usage : bedGraphToBigWig file.bed mm9.chromSizes file.bw

Output is a bigwig file
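bedGraphToBigWig is picky about its input: it wants plain four-column lines (chrom, start, end, value) with no track or browser header, sorted by chromosome and start. A minimal sketch of a clean-up step, assuming tab- or space-separated input and a hypothetical helper name clean_bedgraph:

```shell
# Read a bedGraph on stdin, write a converter-ready bedGraph on stdout:
#  - drop track/browser header lines and comment lines
#  - keep only the first four columns
#  - sort by chromosome name, then numerically by start
clean_bedgraph() {
    awk 'BEGIN { OFS = "\t" }
         !/^(track|browser|#)/ && NF >= 4 { print $1, $2, $3, $4 }' |
    LC_ALL=C sort -k1,1 -k2,2n
}

# Example (file names are placeholders):
# clean_bedgraph < raw.bedGraph > clean.bedGraph
# bedGraphToBigWig clean.bedGraph mm9.chromSizes file.bw
```

Dropping the extra columns silently is a design choice for this sketch; if the fifth column carries the value you actually want to plot, select that one instead.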

Now that you have the bigWig file, let's load it into UCSC. Copy it to your local web server, where it gets a link such as https://projects/files/file.bw. Open your favourite text editor and add the track line:

track type=bigWig name=proteinA smoothingWindow=4 color=123,100,50 autoScale=on viewLimits=1:200 visibility=full windowingFunction=maximum bigDataUrl=https://projects/files/file.bw


You can add multiple track lines for whatever samples you have. If you want to hide a specific track, just comment it out using '#'.
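If you have many samples, the track file can be generated instead of typed. A small sketch, assuming one .bw file per sample under the same base URL; the function names and the sample names in the example are made up for illustration, and the display settings are a subset of the track line above:

```shell
# Print one bigWig track line for a sample name and its file URL.
make_track_line() {
    name="$1"
    url="$2"
    printf 'track type=bigWig name=%s visibility=full autoScale=on color=123,100,50 bigDataUrl=%s\n' "$name" "$url"
}

# Print one track line per sample; $1 is the base URL, the rest are sample names.
# Each sample gets its own name= value so the tracks do not collide.
make_track_file() {
    base_url="$1"
    shift
    for sample in "$@"; do
        make_track_line "$sample" "$base_url/$sample.bw"
    done
}

# Example (hypothetical samples):
# make_track_file https://projects/files proteinA proteinB input > bigwigCaller.txt
```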

Now, name the file as bigwigCaller.txt and call it from UCSC as https://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&position=chr6:122657583-122663796&hgct_customText=https://projects/files/bigwigCaller.txt

Here I specified the assembly as mm9 and the position as chr6:122657583-122663796; you can change these at any time. This will be your default view. I have included other options in the track line as well, which can be turned on/off at any time.
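The link is just three parameters glued together (db, position and hgct_customText), so it is easy to build for other assemblies or regions. A sketch with a hypothetical helper name, ucsc_session_url:

```shell
# Print a UCSC hgTracks link for an assembly, a position and a track file URL.
ucsc_session_url() {
    db="$1"               # assembly, e.g. mm9
    position="$2"         # e.g. chr6:122657583-122663796
    track_file_url="$3"   # URL of the text file holding the track lines
    printf 'https://genome.ucsc.edu/cgi-bin/hgTracks?db=%s&position=%s&hgct_customText=%s\n' \
        "$db" "$position" "$track_file_url"
}

# Example:
# ucsc_session_url mm9 chr6:122657583-122663796 https://projects/files/bigwigCaller.txt
```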

This link can be shared with anyone, and multiple users can view the track at the same time without altering the original file. The file can be password protected as well, but you then have to supply the credentials in the URL, which might be a security risk.

You can use a URL shortener to shorten the URL. Viewers can right-click on the track in UCSC and change its properties (viewing limits, colour, smoothing etc.), so it is customizable as well. Of course, you can also load the bigWig in IGV; I have tested it and it works.

For users who prefer automation or work in R, the track can also be uploaded directly from R using the rtracklayer package:

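library(rtracklayer)
# chip.cov is assumed to be your coverage object, created earlier in your analysis
chip.tmp <- tempfile()
export(chip.cov, chip.tmp, "bedGraph")
# re-import the coverage with the genome set to mm9
restored.chip.track <- import(chip.tmp, "bedGraph", genome = "mm9")
# open a UCSC session, upload the track and view its range
session <- browserSession("UCSC")
track(session, "target") <- restored.chip.track
browserView(session, range(restored.chip.track))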


Check the package manual for more parameters and explanation. The above code was used in the EuTRACC 2010 pipeline.

I hope that helps.

Cheers
Sukhdeep Singh

visualization bigwig chip-seq rna-seq • 41k views

Nice tutorial, great work.


Thanks Istvan :)


This is an excellent tutorial - even I got it working :) One question though: is it possible to upload more than one track? I did the simplest thing and added multiple lines to the bigwigCaller.txt file, but only the first track is uploaded.


How can I check that bedGraphToBigWig file.bed mm9.chromSizes file.bw worked correctly?

After running bedGraphToBigWig, I got the following warning:

bedGraphToBigWig NC-P-17.bedGraph_CpG.sort.bedGraph hg19.chrom.sizes NC-P-17.bw
Expecting 4 words line 1 of NC-P-17.bedGraph_CpG.sort.bedGraph got 5


Presumably it didn't like the track or browser line for some reason.


Thank you so much, Devon. I eventually figured out the problems: sorting and the number of columns, both of which I have noted on your GitHub. I think we could give PileOMeth its own bedGraph-to-bigWig conversion and put it on your GitHub. I am now trying to do it in Perl. When I finish, I will send you the script.


Yeah sure, it works with multiple tracks. Just make sure the name in each track line is different, otherwise you will see only one track with that name :)


Thanks, I realized soon after posting that I had made that silly mistake while copy-pasting :)


Very useful! Good work!


One question about visualizing in UCSC: when we have several wig files covering cases and controls, it is very easy to show the regions that differ. But how do we use it when the number of samples or files grows large? Suppose we have 200 wigs (100 case and 100 control).

Thanks.


As Devon said, visualizing a large number of tracks in a genome browser gets impractical. A few things you could do:

1. Overlay multiple tracks using track hubs.
2. Merge replicates if they are very similar, in case you are comparing control and case.
3. Instead of full visualization, use the dense/squish display modes of the UCSC genome browser. They work fine for most comparisons, except when comparing peak shapes and other small differences.
4. Visualize tracks as heat maps for a specific locus. Many visualization packages exist in R; also look at deepTools, written in Python, which includes many summarization and comparison functions.

If you have 200 samples then looking at things in a genome browser is probably not very useful. You'd spend all day scrolling up/down and inevitably lose track of what the various samples look like.


Hi Sukhdeep, I cannot get fetchChromSizes to work... can the data for chicken (gg4.chromSizes) not be downloaded directly from the UCSC website?


What is the error? Try with galGal4. Otherwise, this is the direct link.


Hi Sukhdeep, thanks... I got that to work, but now I have a different error. Any idea what this might be? Expecting number field 2 line 1 of FGFK27acwith_chr.bedGraph, got type=bedGraph


Your file is not in the proper format. See this for hints: Expecting number field 2 line 1 of ......


Hi, Sukhdeep Singh. Thank you very much for the tutorial. I just had one issue. When I tried to convert my .bdg file from macs14 peak calling, I always had the problem that some coordinates in the .bdg file are larger than the sizes in the chromosome sizes file. I guess this is somehow related to the peak calling process. What I did was manually increase the chromosome sizes by the minimum needed to get it to pass, but I guess that is not good practice. What do you think? Do you have a better idea for solving the issue? Thanks!


Hi, use bedClip (link is from my old blog post).
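For reference, bedClip takes the input file, the chromosome sizes file and the output file, in that order (bedClip input.bed chrom.sizes output.bed), and removes the lines that refer to off-chromosome locations. If you cannot install it, the same idea can be sketched in awk. This is only an approximation under that assumption, not a replacement for the real tool, and drop_off_chrom_lines is a hypothetical name:

```shell
# Print only the bedGraph lines that lie fully inside their chromosome.
# $1 = chromosome sizes file (name<TAB>size), $2 = bedGraph file
drop_off_chrom_lines() {
    awk 'NR == FNR { size[$1] = $2 + 0; next }            # first file: load sizes
         ($1 in size) && $2 + 0 >= 0 && $3 + 0 <= size[$1]  # second file: filter
        ' "$1" "$2"
}

# Example (file names are placeholders):
# drop_off_chrom_lines mm9.chromSizes sample.bdg > sample.clipped.bdg
```

Lines on chromosomes missing from the sizes file are dropped too, which is usually what you want before running bedGraphToBigWig.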


Yes, it worked. Thanks, Sukhdeep!