Question

Why does DiffBind use the count of reads that fall within peaks?

0

2.1 years ago

Aspire ▴ 330

DiffBind starts with peaks (called per sample), and then counts the reads that fall within those peaks. Differences in those read counts between conditions are then tested using DESeq2 or edgeR.

What is the reason for the move from peaks to reads? What benefit does it yield that would not be possible otherwise?

DiffBind • 643 views

updated 2.1 years ago by Rory Stark ★ 2.0k • written 2.1 years ago by Aspire ▴ 330

3

2.1 years ago

jared.andrews07 ★ 16k

It allows you to quantitate differences in instances where a peak is called in both groups, but the magnitude of said peak is not the same. So rather than a binary question of "is the peak shared between groups", the question is "does the signal under the peak differ between groups?", which is often informative.
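To make the distinction concrete, here is a minimal toy sketch in Python. It is not DiffBind code: the read counts, the `PEAK_CALL_THRESHOLD` cutoff and the helper names are all made up for illustration, and a real analysis would use a proper peak caller plus DESeq2 or edgeR rather than a plain ratio of means.

```python
import math

# Hypothetical read counts under one shared peak, three replicates per group.
counts = {
    "control": [100, 110, 90],
    "treatment": [400, 420, 380],
}

# Made-up cutoff standing in for a peak caller's yes/no decision.
PEAK_CALL_THRESHOLD = 50


def peak_called(reads, threshold=PEAK_CALL_THRESHOLD):
    """Binary 'occupancy' call: is there a peak here or not?"""
    return reads >= threshold


def mean(values):
    return sum(values) / len(values)


# Occupancy view: the peak is called in every replicate of both groups,
# so a presence/absence comparison sees no difference at all.
occupancy = {
    group: all(peak_called(c) for c in reps) for group, reps in counts.items()
}

# Read-count view: the signal under the same peak differs four-fold.
log2_fold_change = math.log2(mean(counts["treatment"]) / mean(counts["control"]))

print(occupancy)
print(round(log2_fold_change, 2))
```

In this toy case the peak counts as "shared" under the binary question, while the counts show a log2 fold change of 2 between groups.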


score 2 · Accepted Answer · 2022-04-06

Modelling the distribution of reads across all replicates in the sample groups (regardless of whether a peak was identified in any specific sample replicate) enables a robust quantitative assessment of the evidence for differential binding, including calculation of useful statistics such as p-value, FDR, and fold change (using the underlying statistical analysis packages).
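As a rough illustration of counting reads in every replicate regardless of where peaks were called, here is a simplified Python sketch. It is an assumption-laden toy, not DiffBind's implementation: the sample names, intervals and read positions are invented, and the consensus is a plain union merge of overlapping peaks on a single chromosome.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a consensus set."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads(read_positions, interval):
    """Count reads whose position falls inside the half-open interval."""
    start, end = interval
    return sum(start <= pos < end for pos in read_positions)


# Hypothetical peak calls per replicate; ctrl_2 has no peak called.
peaks = {
    "ctrl_1": [(100, 200)],
    "ctrl_2": [],
    "treat_1": [(150, 260)],
    "treat_2": [(120, 220)],
}

# Hypothetical read positions per replicate.
reads = {
    "ctrl_1": [110, 150, 190, 500],
    "ctrl_2": [130, 170],
    "treat_1": [105, 160, 180, 200, 250, 255],
    "treat_2": [125, 140, 175, 210, 240],
}

# One consensus interval set is built from all samples' peak calls...
consensus = merge_intervals(iv for calls in peaks.values() for iv in calls)

# ...and reads are counted in every replicate, including ctrl_2,
# which contributed no peak of its own.
count_matrix = {
    sample: [count_reads(positions, iv) for iv in consensus]
    for sample, positions in reads.items()
}

print(consensus)
print(count_matrix)
```

The resulting matrix (one row per replicate, one column per consensus interval) is the kind of input that a count-based statistical model can then work with.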

The DiffBind vignette has a section that compares the results of an occupancy analysis using only peak calls to an affinity analysis that models read counts.