Question

Differential Peak Analysis from ENCODE Pipeline normalized outputs

0

4.5 years ago

sjquon • 0

I processed my ChIP-seq and ATAC-seq samples using the ENCODE pipelines (https://github.com/ENCODE-DCC/chip-seq-pipeline2 and https://github.com/ENCODE-DCC/atac-seq-pipeline). These pipelines give outputs such as aligned BAM files, bigwig files of signal normalized to control, and IDR-filtered bigBed peak files.

When my samples are not normalized, they have an efficiency difference so that some samples overall have higher peaks than others. The ENCODE pipeline corrects this problem by correcting to controls, but to my knowledge that output is a bigwig file. I am concerned that if I look at differential peaks using the bam files to look for enrichment, then many peaks will be lost due to the efficiency difference. On the other hand, I can't seem to find any programs where I can use a bigwig file and a peak file to calculate differential peaks.

Does anyone have any good strategies for calculating differential peaks from the normalized outputs of the ENCODE pipeline?

ChIP-Seq ATAC ENCODE • 1.8k views

updated 4.5 years ago by ATpoint 82k • written 4.5 years ago by sjquon • 0

score 0 · Answer 1 · 2019-10-18

"The ENCODE pipeline corrects this problem by correcting to controls"

I do not know the ENCODE pipeline, but I find this unlikely. You have to normalize between samples, not to any controls: in ChIP-seq, controls are typically used only during peak calling, to distinguish signal (true immunoprecipitation by the antibody) from noise (unspecific binding to the antibody, as in an IgG control).

Typically one calls peaks over the samples, creates a count matrix, and then uses tools like edgeR or DESeq2 for normalization and differential analysis. This can be done from the BAM files, e.g. with featureCounts after peak calling with macs2. Don't use the bigwig files for this: differential analysis should start from raw counts; see the edgeR and DESeq2 manuals for details.

Also, don't ever draw conclusions by comparing browser tracks made directly from BAM files. They are completely un-normalized, so any difference you see can be due to different sequencing depth, and the comparison is not informative. Bigwigs that are normalized (at least for depth) can be generated with bamCoverage from deeptools.
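To make the "normalize between samples" point concrete, here is a minimal pure-Python sketch of median-of-ratios size factors, the general idea behind DESeq2's default normalization. It is an illustration only, not DESeq2's actual code, and the function names and the toy count matrix are hypothetical. In practice you would pass the raw featureCounts matrix to edgeR or DESeq2 rather than normalize by hand.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a peaks x samples matrix of raw counts."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Peaks with a zero count have no defined log geometric mean, so skip them.
        if all(c > 0 for c in row):
            log_geo_mean = sum(math.log(c) for c in row) / n_samples
            for j, c in enumerate(row):
                log_ratios[j].append(math.log(c) - log_geo_mean)
    # The median is robust to the minority of peaks that truly change.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, factors):
    """Divide every raw count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: 4 peaks x 2 samples (A, B).
# Sample B has twice the overall efficiency; only the last peak truly differs.
raw = [
    [100, 200],
    [50, 100],
    [30, 60],
    [40, 240],
]

factors = size_factors(raw)
norm = normalize(raw, factors)

for peak, row in zip(raw, norm):
    print(peak, "->", [round(x, 1) for x in row])
```

After scaling, the first three peaks have equal values in both samples, while the last peak keeps a 3-fold difference. The global efficiency difference is absorbed by the size factors, so starting from the raw BAM-derived counts does not lose the real differences.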