Question: Obtain ChIPseq peaks from a bedgraph file with continuous regions
22 months ago, mmaqueda wrote:

Hi all,

I have a bedGraph file from a ChIP-seq experiment (H3K27me3) downloaded from GEO (GSE84324). I want to obtain a list of genes enriched for this mark. However, the bedGraph contains a continuous signal over adjacent regions, whereas I was expecting a set of discrete regions (peaks) that I would only need to annotate with genes.

I've read in the corresponding paper that they used HOMER to generate the bedGraph: makeUCSCfile out_dir -o out.bdg -name sample_name -color track_color -fragLength 150 -avg -fsize 1e20

I am quite new to ChIP-seq analysis and need some advice here on how to proceed. My thoughts are:

a) Use MACS2 bdgbroadcall using the bedGraph as input. Problem: bedGraph has not been generated with MACS/MACS2.

b) Use ScoreMatrixBin from R (genomation package) to get a score for every promoter and then apply a cutoff to identify enriched promoters (this is how the original paper proceeds). This may be a naive approach.

c) Try to obtain the ChIP-seq raw data and repeat the analysis.
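To make option (b) concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea (average the bedGraph signal over each promoter, then apply a cutoff); it is not the genomation ScoreMatrixBin implementation and not necessarily what the paper did. The function names, the promoter coordinates and the cutoff value are all hypothetical.

```python
from typing import Dict, List, Tuple

# (chrom, start, end, value) with 0-based, half-open coordinates, as in bedGraph
BedGraphRow = Tuple[str, int, int, float]
Interval = Tuple[str, int, int]


def parse_bedgraph(lines: List[str]) -> List[BedGraphRow]:
    """Parse bedGraph text lines, skipping track/browser/comment lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        rows.append((chrom, int(start), int(end), float(value)))
    return rows


def mean_signal(rows: List[BedGraphRow], chrom: str, start: int, end: int) -> float:
    """Length-weighted mean signal over [start, end); uncovered bases count as 0."""
    total = 0.0
    for c, s, e, v in rows:
        if c != chrom:
            continue
        overlap = min(e, end) - max(s, start)
        if overlap > 0:
            total += overlap * v
    return total / (end - start)


def enriched_promoters(rows: List[BedGraphRow],
                       promoters: Dict[str, Interval],
                       cutoff: float) -> List[str]:
    """Return gene names whose promoter mean signal is at or above the cutoff."""
    return sorted(gene for gene, (c, s, e) in promoters.items()
                  if mean_signal(rows, c, s, e) >= cutoff)


if __name__ == "__main__":
    demo = parse_bedgraph([
        "track type=bedGraph name=demo",
        "chr1\t0\t100\t1.0",
        "chr1\t100\t200\t5.0",
        "chr1\t200\t300\t0.5",
    ])
    # Hypothetical promoter windows and an arbitrary cutoff, for illustration only
    promoters = {"geneA": ("chr1", 50, 150), "geneB": ("chr1", 200, 300)}
    print(enriched_promoters(demo, promoters, cutoff=2.0))
```

The linear scan over all rows is fine for a toy example; for a genome-wide bedGraph an indexed or sorted lookup would be needed. The cutoff itself would still have to be justified, for example relative to the input track.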

Please, could you give me some advice? Any other proposal is welcome.

Thanks in advance!



Can you add some details? Against which background do you aim to show that certain promoters are enriched? Can you link that paper?

written 22 months ago by ATpoint

Hi ATpoint,

In the same GEO repository, there is an input (bedGraph format) for the same experiment. So I could use this info as background to compare scores.

You can find the paper in:

written 22 months ago by mmaqueda

I would always download the raw data; see "Fast download of FASTQ files from the European Nucleotide Archive (ENA)" and "sra-explorer: find SRA and FastQ download URLs in a couple of clicks". Then call peaks with macs2 using its broad peak option against the control, and intersect the peaks with the promoter coordinates.

FYI, a bedGraph is (in most cases) a genome-wide intensity track that tells you how many reads mapped to each location across the genome. It is not commonly used for peak calling, but rather for visualization in a genome browser or to extract values for profile plots and similar.
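The last step mentioned above, intersecting peaks with promoter coordinates, can be sketched in plain Python as follows. This is a toy illustration and not a replacement for a dedicated interval tool. The promoter definition (2000 bp upstream and 500 bp downstream of the TSS) is an assumption chosen for the example, and all function names and coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open as in BED


def promoter_window(chrom: str, tss: int, strand: str,
                    upstream: int = 2000, downstream: int = 500) -> Interval:
    """Strand-aware window around a TSS (window sizes are an assumption)."""
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    else:
        start, end = tss - downstream, tss + upstream
    return (chrom, max(0, start), end)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def genes_with_peak(promoters: Dict[str, Interval],
                    peaks: List[Interval]) -> List[str]:
    """Return gene names whose promoter overlaps at least one peak."""
    return sorted(gene for gene, prom in promoters.items()
                  if any(overlaps(prom, peak) for peak in peaks))


if __name__ == "__main__":
    # Hypothetical genes and peaks, for illustration only
    promoters = {
        "geneA": promoter_window("chr1", 10000, "+"),
        "geneB": promoter_window("chr1", 50000, "-"),
    }
    peaks = [("chr1", 9000, 9800), ("chr2", 49900, 50100)]
    print(genes_with_peak(promoters, peaks))
```

For broad marks such as H3K27me3, one may prefer to require a minimum overlap rather than a single shared base; that choice is left open here.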

written 22 months ago by ATpoint

Thanks for your help ATpoint!

Yes, I agree with you about working with the raw data in these cases; I guess that is the best option. Thanks also for your explanation of bedGraph. I did not know this before starting the task, but I realized it once I began working with the file.



written 21 months ago by mmaqueda

You can call peaks from a bedGraph using macs2 bdgpeakcall:

usage: macs2 bdgpeakcall [-h] -i IFILE [-c CUTOFF] [-l MINLEN] [-g MAXGAP]
                         [--cutoff-analysis] [--no-trackline] [--outdir OUTDIR]
                         (-o OFILE | --o-prefix OPREFIX)
macs2 bdgpeakcall: error: the following arguments are required: -i/--ifile
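To give an intuition for what the -c (cutoff), -l (minimum length) and -g (maximum gap) options control, here is a simplified sketch of threshold-based region calling on a bedGraph. It is an illustration of the general idea only, not the MACS2 implementation: the function name is hypothetical, the input is assumed to be sorted by chromosome and start, and whether the real tool compares with "greater than" or "greater than or equal" is not assumed here.

```python
from typing import List, Tuple

BedGraphRow = Tuple[str, int, int, float]  # (chrom, start, end, value)
Interval = Tuple[str, int, int]


def call_regions(rows: List[BedGraphRow], cutoff: float,
                 min_length: int, max_gap: int) -> List[Interval]:
    """Keep intervals scoring at or above cutoff, merge neighbours separated
    by at most max_gap bases, then drop merged regions shorter than min_length.
    Rows must be sorted by chromosome and start position."""
    merged: List[List] = []
    for chrom, start, end, value in rows:
        if value < cutoff:
            continue
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            last[2] = max(last[2], end)  # extend the current region
        else:
            merged.append([chrom, start, end])  # open a new region
    return [(c, s, e) for c, s, e in merged if e - s >= min_length]


if __name__ == "__main__":
    # Toy signal, for illustration only
    rows = [
        ("chr1", 0, 100, 1.0),
        ("chr1", 100, 300, 6.0),
        ("chr1", 300, 320, 2.0),
        ("chr1", 320, 400, 7.0),
        ("chr1", 400, 1000, 0.0),
        ("chr1", 1000, 1050, 9.0),
    ]
    print(call_regions(rows, cutoff=5, min_length=200, max_gap=30))
```

Note that the cutoff applies to whatever score the bedGraph holds, so a value that makes sense for a p-value track will generally not make sense for a normalized read-density track such as the one in the question.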

written 3 months ago by Ming Tang