Question: I am having trouble understanding how to begin this ChIP-Seq analysis.
23 months ago by
United States
dally140 wrote:

I'd like to preface this post by saying that I began working in a new lab about two weeks ago and have spent most of my time learning to analyze ChIP-Seq data provided by other lab members. I've managed to learn quite a lot on my own (working on the command line and using tools such as bedtools, HOMER, and MACS2), and I understand, at least in theory, how peak calling, annotation, and motif discovery work.

My main problem at the moment is that I have no idea exactly where to start in order to generate the type of data I am interested in.

Basically, I have been given 8 files containing ChIP-Seq data for various proteins / histone marks (Pol II, KAP1, H3K27Ac, etc.). These files have been aligned and have had their peaks called using MACS2, so I have a Protein_MACS2.summits.bed file for each protein.

I am interested in finding the overlapping regions of H3K27Ac, H3K4me1, and Pol II in exons only (and eventually introns). I then want to generate a metagene plot and/or heatmap of this data that is centered on the beginning or the middle of exons.

For instance, I have seen many metagene plots and heatmaps of Pol II constructed but all of them seem to be centered on the TSS and I can't for the life of me figure out how to center this information on exon starts, or the middle of exons.

My guess is that I must first find some sort of reference data in order to determine which peaks in my ChIP-Seq data fall in exons, but after searching UCSC I can't really find what I'm looking for. I'm aware there is an exon table for hg19 in the Table Browser, but what exactly do I do with this data after I have downloaded it in BED format?

As you can probably tell, I've spent the better part of 3 days attempting to find any sort of solution or tool, or really just anything that could help me out, but I'm at a loss.

I'd appreciate very detailed answers with tools and step-by-step guides on what to do. However, I'm aware you are all very busy, so a simple guideline of "use this tool to get this, then use the output in this tool to generate this" would be much appreciated. I will figure out how the tools work and what commands must be used.

23 months ago by
Devon Ryan70k
Freiburg, Germany
Devon Ryan70k wrote:

You're likely to find deepTools useful (N.B., I'm affiliated with the authors, so I'm biased). It's mostly oriented toward ChIP-seq and could be used to create the metagene plots you want (we would call that a signal profile and usually plot a heatmap along with it), such as here, though the profile has been cropped off (scroll up for an example of a profile). The input to something like that is two files. The first is a BED file with the regions of interest (exons in your case, though I suspect you'll get more meaningful results with a BED12 file of the transcripts). The second is a bigWig file, which you could make either from your BAM files or from the peaks (I generally prefer the former, since it's unbiased by the peak caller).

We also have everything available as a public Galaxy instance, though if you're already familiar with the command line you probably don't need it.
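For the first half of the question (which exons carry H3K27Ac, H3K4me1, and Pol II together), the idea can be sketched in a few lines of plain Python. This is only an illustration of the logic, not what bedtools or deepTools do internally; the function names and the toy coordinates are made up, and it assumes each summit has already been read from a MACS2 summits BED file as a (chrom, position) pair.

```python
def exons_hit_by(exons, summits):
    """Return the set of exons containing at least one summit.

    exons   -- iterable of (chrom, start, end), BED-style 0-based half-open
    summits -- iterable of (chrom, pos), pos being the BED start of the summit
    """
    by_chrom = {}
    for chrom, pos in summits:
        by_chrom.setdefault(chrom, []).append(pos)
    # Linear scan per exon: fine for a sketch, slow for whole genomes.
    return {
        (chrom, start, end)
        for chrom, start, end in exons
        if any(start <= pos < end for pos in by_chrom.get(chrom, []))
    }


def exons_with_all_marks(exons, marks):
    """Return the exons hit by a summit from every mark in `marks`."""
    hit_sets = [exons_hit_by(exons, summits) for summits in marks.values()]
    shared = set.intersection(*hit_sets) if hit_sets else set()
    return sorted(shared)


# Toy data (hypothetical coordinates, for illustration only).
exons = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 100, 200)]
marks = {
    "H3K27Ac": [("chr1", 150), ("chr1", 600)],
    "H3K4me1": [("chr1", 120), ("chr2", 150)],
    "PolII": [("chr1", 199), ("chr1", 200)],
}
shared_exons = exons_with_all_marks(exons, marks)
print(shared_exons)
```

On real data the same question is usually answered with an interval tool such as bedtools rather than a hand-rolled scan; the sketch only shows what "summit falls inside an exon" means with half-open BED coordinates.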


I have a bigWig file that was generated by our previous bioinformaticist. To get the BED / BED12 file, would I simply use the Table Browser in UCSC and download a BED file of exons only for whatever genome I will be using (in this case hg19)?

(reply written 23 months ago by dally)

Yup, for a BED12 you would use something like this, which you can just get via FTP from UCSC (refGene is a common source of things like this).

Edit: I should add that I've never personally used anything other than a BED12 file, so I don't know how many columns you actually need (you might need at least 6).
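As a rough illustration of how the two formats relate, a BED12 transcript line can be expanded into one 6-column BED record per exon using its blockCount, blockSizes, and blockStarts fields (columns 10 to 12 of the BED format). The sketch below is an assumption-laden example rather than a recommended pipeline: the function name and the transcript line are hypothetical, and exons are numbered in genomic order, which is not transcript order on the minus strand.

```python
def bed12_to_exons(line):
    """Expand one tab-separated BED12 line into per-exon BED6 tuples.

    blockStarts are offsets relative to chromStart, so each exon is
    [chromStart + offset, chromStart + offset + size).
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    tx_start = int(fields[1])
    name, score, strand = fields[3], fields[4], fields[5]
    # Both lists may end with a trailing comma, hence the `if x` filter.
    sizes = [int(x) for x in fields[10].split(",") if x]
    offsets = [int(x) for x in fields[11].split(",") if x]

    exons = []
    for i, (size, offset) in enumerate(zip(sizes, offsets), start=1):
        start = tx_start + offset
        exons.append((chrom, start, start + size, f"{name}_exon{i}", score, strand))
    return exons


# Hypothetical transcript with three exons.
example = (
    "chr1\t1000\t5000\tNM_hypothetical\t0\t+\t1200\t4800\t0\t3\t"
    "100,200,300,\t0,1500,3700,"
)
exon_records = bed12_to_exons(example)
for record in exon_records:
    print("\t".join(str(value) for value in record))
```

Keeping the strand column in the output matters if the plot should be oriented 5' to 3', since "the beginning of an exon" is the BED end coordinate for minus-strand features.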

(reply modified 23 months ago • written 23 months ago by Devon Ryan)

Hey Devon, I have a follow-up question to this. Say I have a region file of exons that I'd like to plot from the center. Would I use the scale-regions or the reference-point method of deepTools? And if I use reference-point, would I simply type "center" as the --referencePoint?

(reply written 22 months ago by Bioradical)

Scale-regions, since that would make the most sense for looking at the coverage over a variable-width feature.

(reply written 22 months ago by Devon Ryan)
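To make the distinction concrete, here is a toy sketch of the two ideas in plain Python. It is not deepTools' implementation, and the function names and signal values are invented for illustration: scaling squeezes or stretches every exon into the same number of bins, while a reference-point view takes a fixed-width window around one anchor (here the exon midpoint, which is what `--referencePoint center` selects in computeMatrix's reference-point mode).

```python
def scale_region(signal, start, end, n_bins):
    """Average per-base signal over [start, end) into n_bins bins.

    Exons of any length end up as profiles of identical length,
    which is the idea behind a scale-regions plot.
    """
    length = end - start
    profile = []
    for b in range(n_bins):
        lo = start + (b * length) // n_bins
        hi = start + ((b + 1) * length) // n_bins
        hi = max(hi, lo + 1)  # keep at least one base per bin for short exons
        window = signal[lo:hi]
        profile.append(sum(window) / len(window))
    return profile


def center_window(signal, start, end, flank):
    """Return the raw signal in a fixed window around the region midpoint.

    The window width ignores the exon length, which is the idea
    behind a reference-point plot anchored at the center.
    """
    center = (start + end) // 2
    return signal[center - flank:center + flank]


# Toy per-base coverage: the value at each position equals its index.
signal = list(range(100))
short_exon = (10, 30)   # 20 bp
long_exon = (40, 80)    # 40 bp

scaled_short = scale_region(signal, *short_exon, n_bins=4)
scaled_long = scale_region(signal, *long_exon, n_bins=4)
centered_short = center_window(signal, *short_exon, flank=5)
centered_long = center_window(signal, *long_exon, flank=5)
print(scaled_short, scaled_long)
print(centered_short, centered_long)
```

With scaling, bin 1 always means "the first quarter of the exon" regardless of exon length. With a centered window, position 0 always means "the midpoint", but the window edges may fall inside a long exon and well outside a short one.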