How to view bigwig files in UCSC with windowed coverage?
4.5 years ago
fr ▴ 210

I am trying to view a few ATAC-seq and ChIP-seq bigWig files in the UCSC Genome Browser. I identified the peaks for each file, converted them to bigWig, and was able to upload and view them there. However, what I see in full mode is a bunch of vertical lines:

[Image: UCSC Genome Browser screenshot of the track in full mode, showing vertical lines]

I have also converted one of the bigWigs to wig, and the head of the output looks like this:

#bedGraph section chr1:4768516-52764353
chr1    4768516 4768642 1
chr1    4769922 4770097 1
chr1    4780151 4780361 1
chr1    4785434 4786005 1

This is because every read is being plotted, so the Y axis only goes between 0 and 1; i.e., no windowing is happening. So I have a few questions:

  • How can I specify the windows? In UCSC, the windowing function can be set to mean+whiskers, max, etc., but it will always show 1 as the end result. Shouldn't it be computing the coverage in a given range?
  • Should I compute the coverage myself in a .bed or .bedGraph, convert it to .bw, and then upload that?
  • Is the windowing done in the bigWigs, or is it done in the UCSC Genome Browser?
  • Am I making a mistake or misunderstanding something?
ucsc atac-seq ChIP-Seq • 3.0k views

Thanks for your comment. I may be misunderstanding what you want to say, but I have no problems uploading my data to UCSC. The screenshot I gave is actually from there. I have also updated my question's title, as the focus is on the "windows" part.

4.5 years ago

I generate bedGraphs using deepTools bamCoverage for this purpose. This loop will do it, taking a listing of BAM files (one path per line) as input:

mkdir -p out/UCSC/ ;

# BAM.list holds one BAM path per line
sort BAM.list | while read -r BAM ;
do
  echo "Processing ${BAM}..." ;

  # bamCoverage needs a coordinate-sorted, indexed BAM
  samtools sort -O BAM -o tmpsort.bam "${BAM}" ;
  samtools index tmpsort.bam ;

  # Track name taken from the path; adjust the cut fields to your own layout
  outfile=$(echo "${BAM}" | cut -f3 -d"/" | cut -f1-3 -d"_") ;

  bamCoverage \
    --bam tmpsort.bam \
    --binSize 25 \
    --smoothLength 40 \
    --outFileName out/UCSC/"${outfile}".bedGraph.tmp \
    --outFileFormat bedgraph \
    --region chr2:49918505:51032561 ;

  # Prepend the UCSC track line to the coverage data
  echo "track type=bedGraph name=\"${outfile}\" description=\"${outfile}\" color=100,100,100" > out/UCSC/"${outfile}".bedGraph ;
  cat out/UCSC/"${outfile}".bedGraph.tmp >> out/UCSC/"${outfile}".bedGraph ;
  rm out/UCSC/"${outfile}".bedGraph.tmp ;
done ;

rm tmpsort.bam tmpsort.bam.bai ;

Please edit this code where necessary.
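If you would rather upload a bigWig than a bedGraph, a rough sketch of the conversion follows. It assumes the UCSC bedGraphToBigWig utility is on your PATH and that you have a chrom.sizes file for your assembly. The wrapper name bedgraph_to_bigwig is just something made up for illustration. The track line is stripped and the data sorted first, since the converter expects plain, sorted bedGraph lines.

```shell
# Hypothetical wrapper: bedgraph_to_bigwig IN.bedGraph CHROM.SIZES OUT.bw
# Assumes the UCSC tool bedGraphToBigWig is installed and on PATH.
bedgraph_to_bigwig() {
  tmp=$(mktemp) || return 1
  # Drop track/browser/comment lines, then sort by chromosome and start
  grep -v -E '^(track|browser|#)' "$1" | LC_ALL=C sort -k1,1 -k2,2n > "$tmp"
  bedGraphToBigWig "$tmp" "$2" "$3"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Usage would be something like: bedgraph_to_bigwig out/UCSC/sample.bedGraph hg38.chrom.sizes out/UCSC/sample.bw (file names here are placeholders).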

The config, after upload to UCSC, could then look something like this:





Kevin, this is great, thanks so much. Do you see a straightforward way to do this if I have everything as .bedGraph already?


Oh, you do not have access to the BAMs?


Not anymore, what I got were .bedGraphs and TagDirs :(


TagDirs - from HOMER? If 'yes', then HOMER has a program, makeUCSCfile, that can make UCSC bedGraph files from a tag directory (see HERE)

If I convert bigWig to bedGraph with bigWigToBedGraph, I still obtain a smoothed bedGraph file for UCSC. Be sure that you have the header of your bedGraph correct:

track type=bedGraph name="test" description="test" color=100,100,100
chr1    10000   10003   0.01789
chr1    10003   10101   0.03578
chr1    10101   10104   0.01789
chr1    11297   11375   0.00716
chr1    11576   11578   0.00596
chr1    11578   11604   0.01342
chr1    11604   11677   0.01789
chr1    11677   11679   0.01193
chr1    11679   11705   0.00447
chr1    11714   11732   0.00596
chr1    11732   11767   0.01044
chr1    11767   11815   0.01342
chr1    11815   11824   0.00745
chr1    11824   11833   0.01193
chr1    11833   11851   0.00745
chr1    11851   11867   0.01491
chr1    11867   11868   0.01193
chr1    11868   11876   0.0164
chr1    11876   11917   0.02714
chr1    11917   11925   0.03012
chr1    11925   11952   0.02564

If all else fails, then, indeed, just summarise the coverage yourself using the information in the bedGraph that you have already created.
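A minimal sketch of that last option, using only awk. It assumes the bedGraph is sorted by chromosome and start and has non-overlapping intervals (as you get from bigWigToBedGraph). The function name bin_bedgraph and the 25 bp default are my own choices for illustration, not part of any tool. Each interval is split across fixed-width bins and the mean signal per bin is reported, with uncovered bases counted as 0.

```shell
# Hypothetical helper: bin_bedgraph BIN_SIZE < in.bedGraph > out.bedGraph
# Assumes input sorted by chrom then start, with non-overlapping intervals.
bin_bedgraph() {
  awk -v bin="${1:-25}" '
    BEGIN { OFS = "\t" }
    # Print the bin collected so far (mean = summed signal / bin width)
    function emit() {
      if (have) print chrom, idx * bin, (idx + 1) * bin, sum / bin
      have = 0; sum = 0
    }
    # Skip track, browser and comment lines
    /^(track|browser|#)/ { next }
    {
      s = $2 + 0; e = $3 + 0; v = $4 + 0
      # Walk the interval bin by bin, adding (bases covered * value)
      while (s < e) {
        i = int(s / bin)
        if (!have || $1 != chrom || i != idx) {
          emit(); chrom = $1; idx = i; have = 1
        }
        stop = (i + 1) * bin
        if (stop > e) stop = e
        sum += (stop - s) * v
        s = stop
      }
    }
    END { emit() }
  '
}
```

For example, bin_bedgraph 25 < in.bedGraph > binned.bedGraph. Bins with no signal at all are simply left out, and the last bin on a chromosome can run past the chromosome end, so it may need clipping before a bigWig conversion.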


Dear Kevin, thanks so much, this solved my problem! I found a way to get the bedgraphs and it worked.

I would just like to ask one final question: in general, when you are computing coverage for visualization in UCSC or IGV, what window sizes do you usually use? I understand this is very much case dependent, but I'm just trying to get a general sense of what a sensible window size would be. 15 bp? 25 bp?


Oh, can be anything, really. A smaller window size will produce a larger file, but provides greater resolution. I mean, if you're going to be zooming in on individual regions, it may help to have a greater resolution (small window size) so that it doesn't look 'blocky'. If you're going to just look at entire gene bodies, then perhaps you can use a larger window size.

Given the data that you have, you probably do not have much room for altering the window size.
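To get a feel for the file size side of that trade-off, here is a back-of-the-envelope sketch. The function max_bins is made up for illustration; it just computes the ceiling of region length over bin size, which is an upper bound on the number of bedGraph lines for that region. The coordinates are the ones from the --region example earlier in this thread.

```shell
# Hypothetical helper: max_bins START END BIN_SIZE
# Upper bound on bedGraph lines for a region (ceiling of length / bin size)
max_bins() {
  echo $(( ($2 - $1 + $3 - 1) / $3 ))
}

for bin in 15 25 100; do
  printf '%s bp bins: at most %s lines\n' "$bin" "$(max_bins 49918505 51032561 "$bin")"
done
```

For that roughly 1.1 Mb region this gives at most 74271, 44563 and 11141 lines for 15, 25 and 100 bp bins. Real files may well be smaller, since zero-coverage stretches and runs of identical values do not need one line per bin.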

