Obtain ChIPseq peaks from a bedgraph file with continuous regions
3.1 years ago
mmaqueda • 0

Hi all,

I have a bedGraph file from a ChIP-seq experiment (H3K27me3) downloaded from GEO (GSE84324). I want to obtain a list of genes enriched for this mark. However, the bedGraph contains a continuous signal over adjacent regions, whereas I was expecting a set of discrete enriched regions that I would only need to annotate with genes.

I've read in the corresponding paper that they used HOMER to generate the bedGraph: makeUCSCfile out_dir -o out.bdg -name sample_name -color track_color -fragLength 150 -avg -fsize 1e20

I am quite new to ChIP-seq analysis and need some advice here on how to proceed. My thoughts are:

a) Use MACS2 bdgbroadcall with the bedGraph as input. Problem: the bedGraph was not generated with MACS/MACS2.

b) Use ScoreMatrixBin from R (genomation package) to get a score for every promoter and then apply a cutoff to identify enriched promoters (this is how the original paper proceeds). This may be a naive approach...

c) Try to obtain the ChIP-seq raw data and repeat the analysis.
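For illustration, the idea behind option (b) could be sketched roughly as follows. This is plain Python, not the genomation ScoreMatrixBin API, and the function names, the promoter coordinates and the cutoff are all hypothetical; it only shows the principle of averaging the bedGraph signal over each promoter and keeping those above a threshold.

```python
from typing import Dict, Iterable, List, Tuple

# One bedGraph record: chrom, start, end (0-based, half-open), value
Record = Tuple[str, int, int, float]


def parse_bedgraph(lines: Iterable[str]) -> List[Record]:
    """Parse bedGraph lines, skipping blank, track, browser and comment lines."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), float(value)))
    return records


def mean_signal(records: List[Record], chrom: str, start: int, end: int) -> float:
    """Length-weighted mean signal over [start, end); uncovered bases count as 0."""
    total = 0.0
    for rec_chrom, rec_start, rec_end, value in records:
        if rec_chrom != chrom:
            continue
        overlap = min(rec_end, end) - max(rec_start, start)
        if overlap > 0:
            total += overlap * value
    return total / (end - start)


def enriched_promoters(
    records: List[Record],
    promoters: Dict[str, Tuple[str, int, int]],
    cutoff: float,
) -> List[str]:
    """Return promoter names whose mean signal is at least `cutoff` (arbitrary here)."""
    return [
        name
        for name, (chrom, start, end) in promoters.items()
        if mean_signal(records, chrom, start, end) >= cutoff
    ]
```

The linear scan over all records is fine for a toy example; for a genome-wide track an interval index or a dedicated tool would be more appropriate.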

Please, could you give me some advice? Any other proposal is welcome.

Maria

ChIP-Seq peaks calling HOMER MACS2 bedGraph • 2.6k views
0

Can you add some details? Against which background do you aim to show that certain promoters are enriched? Can you link that paper?

0

Hi ATpoint,

In the same GEO repository, there is an input (bedGraph format) for the same experiment. So I could use this info as background to compare scores.
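As a rough sketch of what comparing scores against the input could look like: given one score per promoter for the ChIP and for the input track, a log2 ratio with a pseudocount is one simple option. The function name, the pseudocount value and the example scores below are hypothetical, and the sketch assumes both tracks were normalized in a comparable way, which would need to be checked for these files.

```python
import math
from typing import Dict


def log2_enrichment(
    chip: Dict[str, float],
    control: Dict[str, float],
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """log2((ChIP + pseudocount) / (input + pseudocount)) for each promoter.

    Promoters missing from `control` are treated as having zero input signal.
    The pseudocount (an arbitrary choice here) avoids division by zero.
    """
    return {
        name: math.log2((score + pseudocount) / (control.get(name, 0.0) + pseudocount))
        for name, score in chip.items()
    }


# Hypothetical per-promoter scores
chip_scores = {"geneA": 3.0, "geneB": 1.0}
input_scores = {"geneA": 1.0, "geneB": 1.0}
enrichment = log2_enrichment(chip_scores, input_scores)
```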

You can find the paper in:

http://dev.biologists.org/content/145/6/dev163162

1

I would always download the raw data, see "Fast download of FASTQ files from the European Nucleotide Archive (ENA)" and "sra-explorer : find SRA and FastQ download URLs in a couple of clicks". Then call peaks with macs2 and its broad peak option against the control, and intersect the peaks with the promoter coordinates.
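The last step, intersecting peaks with promoters, could be sketched like this once the broad peaks are available as (chrom, start, end) tuples. The helper names and the symmetric 1 kb window around the TSS are assumptions made for illustration only; in practice a tool such as bedtools would usually do this job.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # chrom, start, end (0-based, half-open)


def promoter_window(tss: int, upstream: int = 1000, downstream: int = 1000) -> Tuple[int, int]:
    """Symmetric window around a TSS (hypothetical default of 1 kb each side)."""
    return max(0, tss - upstream), tss + downstream


def genes_with_peak(peaks: List[Region], promoters: Dict[str, Region]) -> List[str]:
    """Return genes whose promoter overlaps at least one peak by one base or more."""
    hits = []
    for gene, (chrom, p_start, p_end) in promoters.items():
        for peak_chrom, start, end in peaks:
            # Two half-open intervals overlap if each starts before the other ends
            if peak_chrom == chrom and start < p_end and p_start < end:
                hits.append(gene)
                break
    return hits
```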

FYI, a bedGraph is (in most cases) a genome-wide intensity track that tells you how many reads mapped to any location across the genome. It is not commonly used for peak calling, but rather for visualization in a genome browser or for extracting signal to make profile plots or similar.

0

Yes, I agree with you about working with the raw data in these cases. I guess that is the best option. Thanks for your explanation about bedGraph; I did not know this before starting the task, but realized it once I began working with the file.

Regards,

Maria

0

You can call peaks from a bedGraph using macs2 bdgpeakcall:

usage: macs2 bdgpeakcall [-h] -i IFILE [-c CUTOFF] [-l MINLEN] [-g MAXGAP]
                         [--cutoff-analysis] [--no-trackline] [--outdir OUTDIR]
                         (-o OFILE | --o-prefix OPREFIX)
macs2 bdgpeakcall: error: the following arguments are required: -i/--ifile
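To give an idea of what the cutoff, minimum length and maximum gap options mean, here is a minimal sketch of threshold-based region calling on bedGraph records. It is not the MACS2 implementation and may differ from it in details (for example, how values exactly equal to the cutoff are treated); the function name and all parameter values are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[str, int, int, float]  # chrom, start, end, value
Region = Tuple[str, int, int]


def call_regions(records: List[Record], cutoff: float, min_len: int, max_gap: int) -> List[Region]:
    """Merge intervals scoring at least `cutoff` that lie within `max_gap` bases
    of each other, then keep merged regions of at least `min_len` bases.

    Assumes `records` are sorted by chromosome and start position.
    """
    regions: List[Region] = []
    current = None  # region being extended: [chrom, start, end]
    for chrom, start, end, value in records:
        if value < cutoff:
            continue
        if current and current[0] == chrom and start - current[2] <= max_gap:
            current[2] = max(current[2], end)  # extend the open region
        else:
            if current and current[2] - current[1] >= min_len:
                regions.append((current[0], current[1], current[2]))
            current = [chrom, start, end]
    # Flush the last open region
    if current and current[2] - current[1] >= min_len:
        regions.append((current[0], current[1], current[2]))
    return regions
```

For a broad mark like H3K27me3, larger gap and length values would typically be more sensible than for sharp transcription factor peaks, but suitable numbers depend on the data.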
