Question: Validating ChIP-seq peak-calling output across replicates
3.7 years ago by
bioinfc37 wrote:

In general, I would like to validate my ChIP-seq output from MACS2. My ChIP-seq dataset contains libraries that are not pure technical replicates: the biological sample (one tube) was divided into three samples (three tubes) for sequencing. The variation between samples is likely due to the sequencer. In any case, how can I validate or compare the replicates computationally?

chip-seq • 1.5k views
3.7 years ago by
mforde84 wrote:

You're interested in an irreproducible discovery rate (IDR) analysis. ENCODE has a standard pipeline for this application. I have a GitHub repository with a pipeline implementation available as well.
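
Before running a full IDR analysis, a quick sanity check is to ask what fraction of peaks called in one replicate are also called in another. The sketch below is not IDR (which models the consistency of peak rankings between replicates) and is not part of the ENCODE pipeline; it is only a simple overlap count. The function names and the toy peak coordinates are made up for illustration, and peaks are assumed to be BED-style half-open intervals.

```python
from typing import List, Tuple

# A peak as (chromosome, start, end), BED-style half-open coordinates.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_reproduced(rep_a: List[Peak], rep_b: List[Peak]) -> float:
    """Fraction of peaks in rep_a that overlap at least one peak in rep_b."""
    if not rep_a:
        return 0.0
    hits = sum(1 for p in rep_a if any(overlaps(p, q) for q in rep_b))
    return hits / len(rep_a)


# Toy peak lists (hypothetical values, not real MACS2 output).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
rep2 = [("chr1", 150, 250), ("chr2", 400, 500)]

print(fraction_reproduced(rep1, rep2))
print(fraction_reproduced(rep2, rep1))
```

The all-against-all comparison is fine for a toy example; for real peak files an interval tool such as bedtools would be the more practical choice.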


Nice! However, how current is that ENCODE pipeline (your first link)? I've used the main IDR repo recently but was never quite sure how running IDR this way compares. Also, how important is it to go through the process of generating pseudoreplicates and calling peaks from them (as per the ENCODE pipeline)? Does your pipeline automate this?

written 3.7 years ago by Sentinel156

I'm not sure you really have to worry too much about how current the ENCODE pipeline is, as it's still extensively used and the component software (e.g., IDR, SPP, MACS2) is still being actively developed. I think of it as a pseudo-gold-standard pipeline (in the absence of validation :) ) for TF ChIP peak calling.

I think subsampling just makes the analysis more rigorous. I mean, if you see certain peaks in one pseudoreplicate and not the other, or the peaks are drastically different from baseline, it's questionable whether it's real signal. But yes, my pipeline automates the pseudoreplicate portion as well. I'm not sure it will work out of the box for you, as you'll likely have a different cloud/HPC setup than me. But it should be compatible with a VM running Ubuntu 14.04 LTS, which you can rent from AWS. You'll likely want to go line by line with a small set of test samples, see where things break, make whatever changes are needed, and then throw the kitchen sink at it.
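
For anyone unfamiliar with the pseudoreplicate idea: the aligned reads of a sample are shuffled and split at random into two halves, and peaks are then called on each half separately. A minimal sketch of just the splitting step is below. It is not taken from the pipeline mentioned above; the function name is hypothetical, reads are represented as plain strings, and a real run would operate on aligned read files rather than an in-memory list.

```python
import random
from typing import List, Sequence, Tuple


def make_pseudoreplicates(
    reads: Sequence[str], seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle reads with a fixed seed and split them into two halves."""
    shuffled = list(reads)                  # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Toy read identifiers (hypothetical).
reads = [f"read_{i}" for i in range(10)]
pr1, pr2 = make_pseudoreplicates(reads, seed=42)
print(len(pr1), len(pr2))
```

Each half would then go through the same peak caller, and the two resulting peak sets are compared in the same way as true replicates.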

written 3.7 years ago by mforde84
3.7 years ago by
Melbourne, Australia
Sentinel156 wrote:

OP, you could also use the excellent deepTools2 package to look at variation in your technical reps using the plotPCA/plotCorrelation tools.
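
To give a feel for what a replicate correlation measures, here is a small sketch that computes pairwise Pearson correlations between replicates from per-bin read counts. This is not the deepTools API, just the underlying idea written in plain Python; the bin counts and the `pearson` helper are made up for illustration.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read counts over the same five genomic bins in each replicate.
counts = {
    "rep1": [12, 40, 3, 25, 7],
    "rep2": [10, 44, 2, 27, 9],
    "rep3": [11, 38, 4, 24, 6],
}

names = sorted(counts)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(pearson(counts[a], counts[b]), 3))
```

Technical replicates from the same library would be expected to correlate very highly; a replicate that stands apart from the others is worth a closer look.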
