Question: batch effect for ChIP-seq data
5.1 years ago by Ming Tang (2.6k), Houston/MD Anderson Cancer Center, wrote:

Hi everyone,

I want to compare ChIP-seq data from different sequencing projects, say Epigenome Roadmap vs ENCODE.

How do you normalize across samples? Is it similar to RNA-seq data, where one needs to correct for batch effects?

I know MAnorm and other tools can do this for samples from the same project.

I just want to know how you deal with it for ChIP-seq data from different projects.



5.0 years ago by jotan (1.2k) wrote:

I generally do not normalise ChIP-seq data, although I would be curious to know if others do. The ChIP protocol itself is highly variable, and different modifications/transcription factors can also have vastly different binding profiles, so I'm not certain that applying normalisation would be meaningful. I usually call peaks independently for each sample and overlap them, only normalising for total read counts.
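To make the "call peaks independently and overlap" step concrete, here is a minimal sketch in plain Python. It assumes each peak caller's output has already been parsed into `(chrom, start, end)` tuples with BED-style half-open coordinates. The function name `overlapping_peaks` and the toy peak lists are made up for illustration, and a dedicated interval tool would be the usual choice for real peak files.

```python
from collections import defaultdict


def overlapping_peaks(peaks_a, peaks_b):
    """Return the peaks from peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates,
    so a peak ending at 150 does not overlap one starting at 150.
    """
    # Group the second peak set by chromosome so we only compare
    # peaks that could possibly overlap.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    shared = []
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end for b_start, b_end in by_chrom[chrom]):
            shared.append((chrom, start, end))
    return shared


# Hypothetical peak calls from two projects (toy values, not real data).
roadmap = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
encode = [("chr1", 150, 250), ("chr2", 150, 300)]

shared = overlapping_peaks(roadmap, encode)
print(shared)
```

The nested comparison is quadratic per chromosome, which is fine for a sketch but slow for large peak sets.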


How about comparing the same histone mark among samples from different sequencing centers?

Reply written 5.0 years ago by Ming Tang (2.6k)

That's pretty tricky. I would only be comfortable doing that for very robust histone modifications like H3K4me3. Otherwise, the variability will be quite high. There are also a few other points to keep in mind.

  1. ChIP-seq is not really quantitative. I don't think anyone knows what the signal strength actually represents.
  2. ChIP protocols are not uniform, and most labs have their own signature protocol.

For a robust histone modification, I would normalise for total read counts (randomly extract reads to match the smallest file), call peaks individually and overlap the peaks.
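As a rough illustration of "randomly extract reads to match the smallest file", the sketch below subsamples every sample, without replacement, down to the read count of the smallest one. It assumes the reads are already loaded as Python lists. The function name `downsample_to_smallest`, the sample names and the read IDs are all hypothetical, and with real alignment files you would do the same thing with a dedicated subsampling tool rather than in memory.

```python
import random


def downsample_to_smallest(samples, seed=0):
    """Randomly subsample each sample to the size of the smallest one.

    samples: dict mapping a sample name to a list of reads.
    seed: fixed seed so the subsampling is reproducible.
    """
    rng = random.Random(seed)
    # The smallest library sets the target depth for every sample.
    target = min(len(reads) for reads in samples.values())
    # random.sample draws without replacement, so no read is used twice.
    return {name: rng.sample(reads, target) for name, reads in samples.items()}


# Hypothetical libraries of different depths (toy read IDs).
samples = {
    "roadmap_H3K4me3": [f"read{i}" for i in range(1000)],
    "encode_H3K4me3": [f"read{i}" for i in range(250)],
}

matched = downsample_to_smallest(samples)
print({name: len(reads) for name, reads in matched.items()})
# {'roadmap_H3K4me3': 250, 'encode_H3K4me3': 250}
```

After this step each sample has the same number of reads, so peaks can be called on each one separately and then overlapped.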

Reply written 5.0 years ago by jotan (1.2k), modified 12 months ago by _r_am (31k)

Thanks for sharing your tips!

Reply written 5.0 years ago by Ming Tang (2.6k)
