Dear all, I'm new to ChIP-seq and it bothers me to see such a discrepancy in what people use as a negative control. I will be using an endogenously tagged protein and doing the IP with a commercial anti-HA antibody. I see benefits to using input DNA (as for ChIP-chip), IgG, or an untagged strain (mostly restricted to yeast). Using a pre-immune IgG seems like a good idea, but the very small amount of DNA it pulls down could be highly biased. In my model (the protist Toxoplasma) I would have the luxury of being able to use an untagged strain. My question is: is there a publication where people have systematically compared these three controls? If not, shouldn't it be done, or am I over-worrying here? Thanks!
I found that this recent review gives a good overview of ChIP-seq, including controls: http://www.nature.com/ni/journal/v12/n10/abs/ni.2117.html
I think most labs use input because IgG can be biased (I'm working with human cells and I have not heard of an untagged strain being used). Excerpt from the above paper:
"IgG may be less desirable in certain circumstances because of the following reasons: most IgG antibodies are not obtained from true preimmune serum from the same animal in which the specific antibody was raised; and IgG antibodies usually immunoprecipitate much less DNA than specific antibodies do, and thus limited genomic regions from the control may be overamplified during the library construction step"
I'm not aware of any papers explicitly testing this, but here are some thoughts:
IgG: I agree with you on this, it's essentially the noise in the experiment. Trying to find enriched regions with a log fold change would mean dividing by a value close to zero, so the ratio is unstable and doesn't really make sense. Subtracting the IgG, rather than dividing, would make more sense, but you still wouldn't have a good estimate of the chromatin going into the IP.
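To make the dividing-versus-subtracting point concrete, here is a minimal Python sketch. The read counts are made up, and the function names are just illustrative choices of mine, not taken from any particular peak caller. It assumes the counts for a bin have already been scaled to comparable sequencing depth.

```python
import math

def log2_fold_change(ip, control):
    """log2 ratio of IP signal over control signal for one genomic bin."""
    return math.log2(ip / control)

def difference(ip, control):
    """Simple subtraction of control signal from IP signal for one bin."""
    return ip - control

# Hypothetical depth-normalized signal in one bin (assumed values).
ip_signal = 50.0

# Three near-zero IgG values that differ only by a fraction of a read.
for igg_signal in (0.1, 0.2, 0.4):
    lfc = log2_fold_change(ip_signal, igg_signal)
    diff = difference(ip_signal, igg_signal)
    print(f"IgG={igg_signal:.1f}  log2FC={lfc:.2f}  IP-IgG={diff:.1f}")
```

With a near-zero denominator, a fourfold wobble in the IgG value moves the log2 fold change by two full units while barely changing the subtracted value, which is why the ratio is hard to interpret for a low-yield control.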
Untagged strain: this could be good, but I'd argue that this is introducing another variable (i.e., the culturing of the additional strain) that's not controlled for.
Input: I think this is the best option. It gives you exactly the chromatin that was available for the IP. Furthermore, peak-calling algorithms are typically written with assumptions appropriate for input as the control. Any possible biological changes to the chromatin induced by the HA-tagged protein will show up in the input, something that you won't see in the untagged strain.
I suppose if you have tons of money to spend and tons of starting material, you could do (IP - IgG) to hopefully control for non-specific binding, and then compare that to input as control -- though I'm not aware of anyone who's actually done this.
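For what it's worth, the (IP - IgG) versus input idea could be sketched roughly like this in Python. This is only a sketch under my own assumptions: per-bin read counts for each library, scaling to reads per million, flooring negative differences at zero, and a pseudocount of 1 in the ratio. The function names are hypothetical and none of this is an established method.

```python
import math

def per_million(counts, total_reads):
    """Scale raw per-bin counts to reads per million mapped reads."""
    return [c * 1e6 / total_reads for c in counts]

def background_subtracted_enrichment(ip, igg, inp, pseudocount=1.0):
    """Per bin: subtract IgG from IP, then take log2 ratio against input.

    All three lists are assumed to be depth-normalized already.
    Negative differences are floored at zero (an assumption), and the
    pseudocount keeps the ratio defined when a bin is empty.
    """
    scores = []
    for ip_c, igg_c, in_c in zip(ip, igg, inp):
        specific = max(ip_c - igg_c, 0.0)
        scores.append(math.log2((specific + pseudocount) / (in_c + pseudocount)))
    return scores

# Hypothetical counts for two bins: one bound, one background.
ip_rpm = per_million([400, 40], total_reads=2_000_000)
igg_rpm = per_million([5, 5], total_reads=500_000)
input_rpm = per_million([60, 60], total_reads=3_000_000)

for score in background_subtracted_enrichment(ip_rpm, igg_rpm, input_rpm):
    print(round(score, 2))
```

Bins with a positive score would be enriched over input after removing the IgG background, but the result is sensitive to the pseudocount and to how noisy the IgG library is.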