Question

How to convert bedgraph file with bins into GRanges object?

2.4 years ago

Svetlana ▴ 10

Hi Everyone!

Hope you could give me some advice on the following problem.

I have a bedGraph file containing chromosome bins and log2 ratios (Condition1 vs. Condition2) of ChIP-seq data. The bin size is 50 bp, but the windows can differ in size depending on the log2 value:

[Image: screenshot of the bedGraph file, not preserved]

This file is a result of bigwigCompare (Galaxy) used on hg18 wiggle files downloaded from Gene Expression Omnibus, so I don't have access to raw reads.

From here, I would like to overlap these bins with another file containing peaks and their locations, annotated with the ChIPpeakAnno R package (there I used EnsDb.Hsapiens.v75 because the reads were aligned to hg19). How can I overcome the difference in reference genomes?

Any suggestions?

Thanks!

Svetlana

Bedgraph Bedtools • 1.2k views

Updated 10 months ago by Ram 43k • written 2.4 years ago by Svetlana ▴ 10

score 1 · Answer 1 · 2021-12-16

You could convert your bedGraph bins from hg18 to hg19 with liftOver, so that they are in the same coordinates as your peaks and can be overlapped with them. Read the bins into a GRanges object, pass it to the liftOver function from rtracklayer together with an hg18-to-hg19 chain, then unlist the result (liftOver returns a GRangesList) to get back a regular GRanges object. For this you need a liftOver chain file from UCSC, which you can download through a web browser, FTP, or wget:

# get chain file from UCSC
wget 'http://hgdownload.soe.ucsc.edu/goldenPath/hg18/liftOver/hg18ToHg19.over.chain.gz'

# uncompress it
gunzip hg18ToHg19.over.chain.gz

Then, within R, convert your bedGraph data to GRanges. How you do that depends on the format you have it in; I'll assume you have a data frame:

library(GenomicRanges)
library(rtracklayer)

# make some toy data
df <- data.frame(chr=rep("chr1", 4), start=seq(100,250,by=50), end=seq(150,300,by=50), score=rnorm(4, mean=7.5))

# convert it to GRanges
df_gr <- makeGRangesFromDataFrame(df, keep.extra.columns=TRUE)

# import the chain file
chain <- import.chain("hg18ToHg19.over.chain")

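# convert from hg18 to hg19; liftOver returns a GRangesList
# with one list element per input range
df_hg19_gr <- liftOver(df_gr, chain)

# flatten the list into a regular GRanges object
results <- unlist(df_hg19_gr)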
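If the bins are still in a bedGraph file rather than a data frame, rtracklayer can read them directly. The sketch below writes a small made-up bedGraph file to a temporary location (the file name, coordinates, and scores are assumptions for illustration only) and imports it. Note that bedGraph coordinates are 0-based and half-open, while GRanges are 1-based and closed; import() handles that conversion, and makeGRangesFromDataFrame() has a starts.in.df.are.0based argument for the same purpose when you start from a data frame.

```r
library(rtracklayer)

# Write a tiny, made-up bedGraph file (chrom, start, end, score; tab-separated).
# In practice you would point import() at your own file instead.
bg_file <- tempfile(fileext = ".bedGraph")
writeLines(c("chr1\t100\t150\t0.8",
             "chr1\t150\t200\t-1.2",
             "chr1\t200\t300\t0.3"), bg_file)

# Read it as a GRanges object; the fourth column becomes the 'score'
# metadata column, and 0-based starts are shifted to 1-based.
bins_gr <- import(bg_file, format = "bedGraph")
bins_gr
```

The resulting object can be passed to liftOver() in place of the toy df_gr.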

Check the toy input (df_gr) and the lifted-over output (results):

> df_gr
GRanges object with 4 ranges and 1 metadata column:
      seqnames    ranges strand |     score
         <Rle> <IRanges>  <Rle> | <numeric>
  [1]     chr1   100-150      * |   9.94368
  [2]     chr1   150-200      * |   8.93145
  [3]     chr1   200-250      * |   7.19089
  [4]     chr1   250-300      * |   5.83782
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> results
GRanges object with 4 ranges and 1 metadata column:
      seqnames      ranges strand |     score
         <Rle>   <IRanges>  <Rle> | <numeric>
  [1]     chr1 10100-10150      * |   9.94368
  [2]     chr1 10150-10200      * |   8.93145
  [3]     chr1 10200-10250      * |   7.19089
  [4]     chr1 10250-10300      * |   5.83782
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
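Once the bins are in hg19 coordinates, they can be overlapped with the peaks. The following is only a minimal sketch with made-up data: bins stands in for the lifted-over results object, and peaks, peak_id, and mean_log2 are hypothetical names. It assumes the peaks carry Ensembl-style chromosome names ("1" rather than "chr1"), as annotations based on EnsDb packages typically do, so the naming styles are harmonized first.

```r
library(GenomicRanges)

# Toy hg19 bins, standing in for the lifted-over 'results' object
bins <- GRanges("chr1",
                IRanges(start = c(10100, 10150, 10200, 10250), width = 50),
                score = c(9.9, 8.9, 7.2, 5.8))

# Hypothetical peaks with Ensembl-style chromosome names
peaks <- GRanges("1",
                 IRanges(start = c(10120, 20000), end = c(10220, 20100)),
                 peak_id = c("peak_a", "peak_b"))

# Make chromosome names comparable ("1" becomes "chr1")
seqlevelsStyle(peaks) <- "UCSC"

# Find every (peak, bin) pair that overlaps
hits <- findOverlaps(peaks, bins)

# Summarize the bin scores per peak; peaks without any overlapping bin stay NA
peaks$mean_log2 <- NA_real_
mean_by_peak <- tapply(bins$score[subjectHits(hits)], queryHits(hits), mean)
peaks$mean_log2[as.integer(names(mean_by_peak))] <- as.numeric(mean_by_peak)
peaks
```

Taking the plain mean ignores that bins can differ in width; weighting by the overlap width would be a reasonable alternative if the windows vary a lot.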