Dear all, I'm new to ChIP-seq, and it bothers me to see such a discrepancy in what people use as a negative control. I will be using an endogenously tagged protein and doing the IP with a commercial anti-HA antibody. I see benefits to using input DNA (as for ChIP-chip), IgG, or an untagged strain (mostly restricted to yeast). Using a pre-immune IgG seems like a good idea, but the very small amount of DNA it pulls down could be highly biased. In my model (the protist Toxoplasma) I have the luxury of being able to use an untagged strain. My question is: is there a publication where people have systematically compared the three methods? If not, shouldn't it be done, or am I over-worrying here? Thanks!

I found that this recent review gave a good overview of ChIP-seq, including controls.

I think most labs use input because IgG can be biased (I work with human cells and have not heard of an untagged strain being used there). Excerpt from the above paper:

"IgG may be less desirable in certain circumstances because of the following reasons: most IgG antibodies are not obtained from true preimmune serum from the same animal in which the specific antibody was raised; and IgG antibodies usually immunoprecipitate much less DNA than specific antibodies do, and thus limited genomic regions from the control may be overamplified during the library construction step"

@ Ying W: Thanks for the paper, nice one that I had not seen. Seems like we all agree on the IgG issue.

I'm not aware of any papers explicitly testing this, but here are some thoughts:

  • IgG: I agree with you on this; it's essentially the noise in the experiment. Trying to find enriched regions with a log fold change would mean dividing by something close to zero, which doesn't make sense. Subtracting the IgG, rather than dividing by it, would make more sense, but you still wouldn't have a good estimate of the chromatin going into the IP.

  • Untagged strain: this could be good, but I'd argue that this is introducing another variable (i.e., the culturing of the additional strain) that's not controlled for.

  • Input: I think this is the best option. It gives you exactly the chromatin that was available for the IP. Furthermore, peak-calling algorithms are typically written with assumptions appropriate for input as the control. Any possible biological changes to the chromatin induced by the HA-tagged protein will show up in the input, something that you won't see in the untagged strain.
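To make the "dividing by close-to-zero" point concrete, here is a minimal sketch with invented read counts (not real data). The helper names `log2_fold_change` and `spread` are made up for illustration, and the counts are chosen only to show how a one-read change in a sparse control swings a log ratio much more than the same change in a deeper control.

```python
import math

def log2_fold_change(ip_count, control_count):
    """log2(IP / control) for one bin; infinite when the control bin is empty."""
    if control_count == 0:
        return math.inf
    return math.log2(ip_count / control_count)

def spread(values):
    """Range (max - min) of the finite values in a list."""
    finite = [v for v in values if math.isfinite(v)]
    return max(finite) - min(finite)

# Invented read counts for three bins with identical IP signal.
ip = [40, 40, 40]
igg = [0, 1, 2]            # sparse control
input_dna = [20, 21, 22]   # deeper control, same one-read steps

igg_lfc = [log2_fold_change(i, c) for i, c in zip(ip, igg)]
input_lfc = [log2_fold_change(i, c) for i, c in zip(ip, input_dna)]
igg_diff = [i - c for i, c in zip(ip, igg)]  # subtraction instead of division

print("log2(IP/IgG):  ", igg_lfc)
print("log2(IP/input):", input_lfc)
print("IP - IgG:      ", igg_diff)
```

With these made-up numbers, one extra IgG read moves the log ratio by a full unit (and an empty IgG bin gives an infinite ratio), while the same one-read steps in input move it by well under 0.2. The subtracted values barely change, which is the sense in which subtracting IgG behaves better than dividing by it.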

I suppose if you have tons of money to spend and tons of starting material, you could do (IP - IgG) to hopefully control for non-specific binding, and then compare that to input as control -- though I'm not aware of anyone who's actually done this.
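As a rough sketch of what that (IP - IgG) versus input comparison might look like for a single bin: everything below is an assumption for illustration, not an established method. The reads-per-million scaling, the floor at zero after subtraction, the pseudocount, and the function names `per_million` and `background_corrected_enrichment` are all hypothetical choices.

```python
import math

def per_million(count, library_size):
    """Scale a raw bin count to reads per million mapped reads."""
    return count * 1_000_000 / library_size

def background_corrected_enrichment(ip, igg, inp,
                                    ip_total, igg_total, inp_total,
                                    pseudocount=1.0):
    """log2 of (IP - IgG) over input for one bin, after depth scaling.

    Assumptions (not from any published pipeline): the subtraction is
    floored at zero and a pseudocount keeps the ratio finite.
    """
    ip_rpm = per_million(ip, ip_total)
    igg_rpm = per_million(igg, igg_total)
    inp_rpm = per_million(inp, inp_total)
    specific = max(ip_rpm - igg_rpm, 0.0)  # IgG-subtracted signal
    return math.log2((specific + pseudocount) / (inp_rpm + pseudocount))

# Invented counts and library sizes for two bins.
enriched = background_corrected_enrichment(400, 10, 200,
                                           10_000_000, 1_000_000, 20_000_000)
background = background_corrected_enrichment(100, 10, 200,
                                             10_000_000, 1_000_000, 20_000_000)
print(enriched, background)
```

In the second bin the scaled IP signal does not exceed the scaled IgG signal, so the subtracted value floors at zero and the bin scores below input. Whether flooring, or the subtraction itself, is statistically sensible with a very shallow IgG library is exactly the open question raised above.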

@ daler: thanks for the detailed answer. I did not know that the algorithms were specifically designed with input as the control. Good idea for the IP - IgG. Do you add IgG at the pre-clear step?

