I am seeking advice on how to pair a normalization method with yeast spike-in DNA in a time course CUT&RUN experiment with 5 timepoints and 3 replicates per timepoint. I am using MACS3 to call peaks and DiffBind (with the DESeq2 workflow) to test for differential binding between the 0 hr timepoint and each subsequent timepoint.
I have tried (a) normalizing tag counts according to spike-in library size and (b) normalizing according to spike-in relative log expression (RLE). In Figure 32 on page 55 of the DiffBind vignette, these two normalization methods produce nearly identical results.
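To make sure I understand what the two methods are doing, here is my reading of the two factor calculations, as a minimal Python sketch of the scaling math on a hypothetical toy spike-in count matrix (not DiffBind's actual code):

```python
import numpy as np

# Toy spike-in count matrix: rows = sites, columns = samples (made-up numbers)
counts = np.array([
    [10., 20., 15.],
    [100., 220., 130.],
    [5., 9., 7.],
    [50., 95., 60.],
])

# (a) Library-size factors: each sample scaled by its total spike-in depth
lib_sizes = counts.sum(axis=0)
lib_factors = lib_sizes / lib_sizes.mean()

# (b) RLE (DESeq2 median-of-ratios) factors: ratio of each count to the
#     per-site geometric mean, then the per-sample median of those ratios
geo_means = np.exp(np.log(counts).mean(axis=1))
rle_factors = np.median(counts / geo_means[:, None], axis=0)

print(lib_factors)
print(rle_factors)
```

On this toy matrix the two sets of factors come out similar but not identical, which is the point: they only coincide when the samples differ by a roughly uniform depth shift.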
However, in my case, I am getting different results. Namely, one timepoint always yields 0 differentially bound sites, and the choice of normalization method determines at which timepoint that zero occurs:
When I normalize by spike-in library size, I obtain the following results table (Group refers to the timepoint being compared, Group2 refers to the 0 hr timepoint, and DB.DESeq2 shows the number of differentially bound sites at FDR < 0.05 for that contrast):
But when I normalize by spike-in RLE, I obtain the following results table:
When I look at the normalized tag count profiles of the top 5 sites identified under each normalization setting, the profiles show different patterns, which likely accounts for the differences in the DiffBind results.
Shown below are my spike-in library sizes per replicate (colored by timepoint). Note that the 0 hr and 24 hr groups have the most variability in the spike-in library size between replicates.
I have two questions that I would like feedback on:
- How do we choose an appropriate normalization method to pair with spike-in data? I understand that library-size and RLE normalization rest on different underlying assumptions, but I don't know how to assess whether those assumptions hold for spike-in data. For example, RLE assumes balanced differential expression (most features unchanged, with the changes roughly symmetric between up and down); I don't know what that assumption means in the context of a spike-in.
- Why are my results so different between these two normalization methods? Is the variability in spike-in library sizes driving a wedge between these normalization methods somehow?
I would appreciate any advice! Thanks! -Ethan