I'd like to learn how to process ChIP-seq data on my own, so I chose a paper and am trying to mimic its data analysis steps. The details for these data: cell line: MCF7 // Illumina HiSeq 2000 // 50 bp // single-end // phred+33
Short answer: yes, it's entirely possible that the control samples really were sequenced significantly deeper than the ChIP samples. The degree of difference in that dataset is a bit...extreme, but perhaps they had other reasons (e.g., they were planning to use the same controls for more deeply sequenced samples, or they expected high ChIP efficiency with few actual binding events). It's best not to skimp on control depth; normally you'd sequence the control at least as deeply as the ChIP. Why? Because peak calling requires scaling the samples to a common depth, and to avoid amplifying noise you scale the more deeply sequenced sample down rather than the shallower one up. If it's the input that gets scaled down, it will still be somewhat better for estimating things like local variance levels than the other way around (I'm thinking in terms of peak calling with MACS2, so YMMV with other methods).
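To make the "scale the deeper sample down" idea concrete, here is a minimal Python sketch of linear depth scaling. It is only an illustration of the principle, not MACS2's actual code; the function name `scale_factors` and the read counts are made up for the example.

```python
from typing import Tuple


def scale_factors(chip_reads: int, control_reads: int) -> Tuple[float, float]:
    """Return (chip_scale, control_scale) so both samples match the shallower depth.

    The deeper sample gets a factor below 1 (scaled down); the shallower
    sample keeps a factor of 1, so its noise is never multiplied up.
    """
    if chip_reads <= 0 or control_reads <= 0:
        raise ValueError("read counts must be positive")
    smaller = min(chip_reads, control_reads)
    return smaller / chip_reads, smaller / control_reads


# Hypothetical depths: 10M ChIP reads, 40M control reads.
chip_scale, control_scale = scale_factors(10_000_000, 40_000_000)
print(chip_scale, control_scale)  # 1.0 0.25
```

With a deeper control, the ChIP signal is left untouched and the control pileup is multiplied by 0.25. In the reverse situation the ChIP signal would be the one scaled down, which is the less favorable case described above.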