Tutorial: bedGraphToBigWig Tutorial and Report
5.8 years ago
Shicheng Guo ★ 8.8k

It is very easy to run into errors when using bedGraphToBigWig, and I want to save newcomers some time. The following procedure should work in most situations.

1, Remove the header line from the bedGraph before sorting (skip this step if your file has no header, otherwise you lose the first data row)

awk 'NR!=1' input.bedGraph > input.deheader.bedGraph
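If you are not sure whether a file has a header, dropping line 1 blindly is risky. Here is a minimal sketch that removes only header-like lines, assuming headers start with `track`, `browser`, or `#`. The file names and demo data are made up for illustration.

```shell
# Demo input (hypothetical data): one track line followed by two data rows.
workdir=$(mktemp -d)
printf 'track type=bedGraph name="demo"\nchr1\t100\t200\t0.5\nchr1\t0\t100\t1.0\n' > "$workdir/input.bedGraph"

# Keep only lines that do NOT start with "track", "browser", or "#".
grep -v -E '^(track|browser|#)' "$workdir/input.bedGraph" > "$workdir/input.deheader.bedGraph"

# Show what is left: only the data rows.
cat "$workdir/input.deheader.bedGraph"
```

If your header is a plain column-name line instead, this pattern will not catch it and the `awk 'NR!=1'` approach is the one to use.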

2, Sort the bedGraph by chromosome, then by start position

sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph
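To confirm a file really is in the expected order before converting, `sort -c` can check it without rewriting anything. This is only a sketch with made-up demo data. Setting `LC_ALL=C` is a common precaution so that the ordering does not depend on your locale; it is my assumption that you want byte-order sorting here.

```shell
# Demo input (hypothetical data), deliberately out of order.
workdir=$(mktemp -d)
printf 'chr2\t0\t50\t1\nchr1\t100\t200\t2\nchr1\t0\t100\t3\n' > "$workdir/unsorted.bedGraph"

# Sort by chromosome name, then numerically by start position.
LC_ALL=C sort -k1,1 -k2,2n "$workdir/unsorted.bedGraph" > "$workdir/sorted.bedGraph"

# sort -c exits non-zero if the file is not already in the requested order.
if LC_ALL=C sort -c -k1,1 -k2,2n "$workdir/sorted.bedGraph"; then
    echo "sorted"
else
    echo "not sorted"
fi
```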

3, The chromosome sizes must come from the same genome assembly that your BAM files were aligned to

fetchChromSizes hg19 > hg19.chrom.sizes
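A mismatch between the bedGraph and the chrom.sizes file is a frequent source of errors, so it can help to check them against each other first. The sketch below assumes a two-column, tab-separated chrom.sizes file (name, length). The file names and the lengths in the demo data are invented for illustration.

```shell
workdir=$(mktemp -d)
# Hypothetical chrom.sizes (name<TAB>length) and bedGraph; values are made up.
printf 'chr1\t1000\nchr2\t500\n' > "$workdir/demo.chrom.sizes"
printf 'chr1\t0\t100\t1\nchr2\t400\t600\t2\nchrM\t0\t10\t3\n' > "$workdir/demo.bedGraph"

# First file: load the sizes. Second file: report chromosomes missing from
# the sizes file and intervals whose end exceeds the chromosome length.
awk -F'\t' '
    NR == FNR                   { size[$1] = $2; next }
    !($1 in size)               { print "unknown chromosome: " $1; next }
    ($3 + 0) > (size[$1] + 0)   { print "end past chromosome size: " $0 }
' "$workdir/demo.chrom.sizes" "$workdir/demo.bedGraph" > "$workdir/problems.txt"

cat "$workdir/problems.txt"
```

An empty `problems.txt` means every interval fits inside a chromosome listed in the sizes file.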

4, Make sure the bedGraph has exactly 4 columns

awk '{print $1,$2,$3,$4}' OFS="\t" NC-P-2.bedGraph_CpG.sort.bedGraph > NC-P-2.bedGraph_CpG.sort.4.bedGraph
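Before trimming, you may want to know whether a file has extra columns at all. A small sketch with made-up demo data that counts the offending lines and then keeps the first four columns:

```shell
workdir=$(mktemp -d)
# Hypothetical data: the first row carries an extra fifth column.
printf 'chr1\t0\t100\t1\t+\nchr1\t100\t200\t2\n' > "$workdir/demo.bedGraph"

# Count lines whose number of fields is not exactly 4.
bad=$(awk 'NF != 4 { n++ } END { print n + 0 }' "$workdir/demo.bedGraph")
echo "lines without exactly 4 columns: $bad"

# Keep the first four columns, tab-separated.
awk 'BEGIN { OFS = "\t" } { print $1, $2, $3, $4 }' "$workdir/demo.bedGraph" > "$workdir/demo.4.bedGraph"
```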


5, Now you can run the conversion (bedGraphToBigWig input.sorted.bedGraph hg19.chrom.sizes output.bw)

Summary: you can do the bedGraph to bigWig conversion in two steps. First drop the header line, sort, and keep the first four columns; then convert:

tail -n +2 NC-P-25.bedGraph_CpG.bedGraph | sort -k1,1 -k2,2n | awk '{print $1,$2,$3,$4}' OFS="\t" > NC-P-25.bedGraph_CpG.bedGraph.sort

bedGraphToBigWig NC-P-25.bedGraph_CpG.bedGraph.sort hg19.chrom.sizes NC-P-25.bw

For a large number of bedGraph files:

# bedGraph to bigWig: drop the header line, sort, keep 4 columns, then convert
for i in *bedGraph
do
tail -n +2 "$i" | sort -k1,1 -k2,2n | awk '{print $1,$2,$3,$4}' OFS="\t" > "$i.sort"
bedGraphToBigWig "$i.sort" ~/oasis/db/hg19/hg19.chrom.sizes "$i.bw"
done




It is good to have this written up. Could you also say where to get fetchChromSizes, with a link? There are numerous ways to obtain chromosome sizes; for example, I could simply point to HOMER, since it is a suite covering a number of *-seq pipelines. For a newbie this will be important. As you know, some software also provides the chromosome sizes directly. There is also an error in the post: fetchChromSizes hg19 > hg18.chrom.sizes should write to hg19.chrom.sizes.

Also, the file naming is a bit tricky to follow from step to step. It would help if you could make it simpler for newbies. I appreciate your effort; the post could be enriched further.

Another thing: the sorting (I appreciate that you included it) is also available in BEDOPS. People should know its power too, so you could add it if you want (just a suggestion).


Note that historically there are small differences in the way that NCBI, EBI and UCSC name the chromosomes. What is "MT" for EBI, is called "chrMT" for NCBI and "chrM" for UCSC. If you used a genome not from UCSC for your analysis, you may have to fix up these small differences. To convert EBI or NCBI chrom names to UCSC chrom names in a wig or bed text file, you can use UCSC's little utility chromToUcsc. Download it with "wget https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/chromToUcsc", make it executable with "chmod a+x chromToUcsc" (it's a python2/3 script) and run it without arguments to get the usage message. Here is an example call: chromToUcsc -g hg19 --get && chromToUcsc -i test.wig -o test.ucsc.wig -a hg19.chromAlias.tsv -g hg19


Hi, I'm using only chr1 as my reference genome, so fetchChromSizes hg19 wouldn't work for me, right? What could I use instead?