Normalizing ChIP-seq Data To Spike-In Control
3.0 years ago
megannj ▴ 20

Hello, all!

I am currently working on normalizing some ChIP-seq data I've generated to a spike-in control. The ChIP was performed in mouse cells and our spike-in was human chromatin.

I've referenced the Active Motif kit instructions on how to do this, and I've talked to a few different individuals, but I am still not sure exactly how to perform this normalization.

I've tried several different methods, but here is the general workflow:

  1. Align to a merged genome containing both mouse and human chromosomes.

  2. Pull out all alignments to the human genome and count them.

  3. Use the sample with the smallest number of reads mapping to the human genome to create a normalization factor. For example:

    • Sample 1 (1,000,000 reads)
    • Sample 2 (2,000,000 reads) | Normalization Factor = 1,000,000/2,000,000 = 0.5
    • Sample 3 (3,000,000 reads) | Normalization Factor = 1,000,000/3,000,000 = 0.33
  4. Using the reads that aligned to the mouse genome, pull the subset of reads designated by the normalization factor and map these. I have been subsetting reads using the -s option of samtools view. For example:

    • Sample 1 (10,000,000 mouse reads) --> Map all 10,000,000
    • Sample 2 (30,000,000 mouse reads) --> Map 50% of these, or 15,000,000
    • Sample 3 (60,000,000 mouse reads) --> Map 33% of these, or about 20,000,000
  5. Create a bed file from the sam file and extend the reads.

  6. Use this bed file to generate a bigwig file for viewing data.
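
To make the arithmetic in steps 3 and 4 concrete, here is a minimal Python sketch of what I am doing. It only reproduces the calculation described above and is not a validated spike-in pipeline. The sample names, BAM file names, and the seed value 42 are placeholders made up for illustration, and the human read counts would come from counting the alignments in step 2.

```python
def spike_in_factors(human_counts):
    """Scale every sample to the one with the fewest spike-in (human) reads."""
    smallest = min(human_counts.values())
    return {name: smallest / count for name, count in human_counts.items()}


def samtools_subsample_cmd(in_bam, out_bam, fraction, seed=42):
    """Build a `samtools view -s` command as an argument list.

    Returns None when the fraction rounds to 1, because the reference
    sample keeps all of its reads and needs no subsampling.
    """
    if round(fraction, 4) >= 1.0:
        return None
    # -s expects SEED.FRACTION: the integer part is the random seed and
    # the digits after the decimal point are the fraction of reads kept.
    frac_digits = f"{fraction:.4f}".split(".")[1]
    return ["samtools", "view", "-b", "-s", f"{seed}.{frac_digits}",
            "-o", out_bam, in_bam]


# Example counts from the post (hypothetical sample names).
human_counts = {"sample1": 1_000_000, "sample2": 2_000_000, "sample3": 3_000_000}
mouse_counts = {"sample1": 10_000_000, "sample2": 30_000_000, "sample3": 60_000_000}

factors = spike_in_factors(human_counts)
for name, factor in factors.items():
    kept = round(mouse_counts[name] * factor)
    # Hypothetical file names; replace with the real mouse-only BAM files.
    cmd = samtools_subsample_cmd(f"{name}.mouse.bam", f"{name}.mouse.norm.bam", factor)
    print(name, f"factor={factor:.2f}", f"mouse reads kept={kept}", cmd)
```

Each returned command list could be passed to subprocess.run, or joined with spaces and pasted into a shell. The sample with the smallest human count gets no command because all of its mouse reads are kept.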

To be more specific, I have a WT line, a heterozygous knockout, and a homozygous knockout of a protein and I've done ChIP for that protein in each line. However, when I normalize using the aforementioned method, the homozygous knockout has higher signal than the wild type at binding sites of the protein that is knocked out.

Does my method sound correct? Can anyone provide a script or instructions on how exactly to perform a spike-in normalization?

Thanks for all your help!


ChIP-Seq normalization spike-in genomics • 3.0k views

Did you find a way to solve this problem? I am having the exact same issue!

