4.4 years ago by
Taiwan/Taichung/China Medical University Hospital
I had a similar question before. The sentences you quote are indeed not easy for a newcomer to understand. Below are clearer descriptions from the MACS papers, followed by an example in which I try to explain the calculation, for your reference.
Feng, J. X., T. Liu, B. Qin, Y. Zhang & X. S. Liu (2012) Identifying ChIP-seq enrichment using MACS. Nat. Protoc., 7, 1728.
Estimating the empirical false discovery rate by exchanging ChIP-seq and control samples. When a control sample is available, MACS can also estimate an empirical FDR for every peak by exchanging the ChIP-seq and control samples and identifying peaks in the control sample using the same set of parameters used for the ChIP-seq sample. Because the control sample should not exhibit read enrichment, any such peaks found by MACS can be regarded as false positives. For a particular P value threshold, the empirical FDR is then calculated as the number of control peaks passing the threshold divided by the number of ChIP-seq peaks passing the same threshold.
Zhang, Y., T. Liu, C. A. Meyer, J. Eeckhoute, D. S. Johnson, B. E. Bernstein, C. Nussbaum, R. M. Myers, M. Brown, W. Li & X. S. Liu (2008) Model-based Analysis of ChIP-Seq (MACS). Genome Biol., 9, R137.
For a ChIP-Seq experiment with controls, MACS empirically estimates the false discovery rate (FDR) for each detected peak using the same procedure employed in the previous ChIP-chip peak finders MAT and MA2C. At each p-value, MACS uses the same parameters to find ChIP peaks over control and control peaks over ChIP (that is, a sample swap). The empirical FDR is defined as Number of control peaks / Number of ChIP peaks…. To compare their prediction specificity, we swapped the ChIP and control samples, and calculated the FDR of each algorithm as Number of control peaks / Number of ChIP peaks using the same parameters for ChIP and control.
For example, suppose MACS uses an H3K27ac ChIP sample as the treatment and an Input sample as the control, and identifies a peak, call it peak X, with p-value = 0.00024 under the parameters you set. Overall, there are 1,000 ChIP peaks with p-value ≤ 0.00024 (i.e. many peaks pass a given p-value threshold). MACS then swaps the samples, using the Input sample as the treatment and the H3K27ac ChIP sample as the control, and calls peaks again with the same parameters. Suppose this finds 48 peaks in the Input over the H3K27ac ChIP with p-value ≤ 0.00024. The FDR for peak X is then 48 / 1000 = 0.048. This means that, at this threshold, about 4.8 out of every 100 peaks in our H3K27ac ChIP sample (48 of the 1,000) are expected to be false positives, if we assume that none of the Input peaks represents real enrichment.
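The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the swap-based ratio described in the papers, not MACS's actual code: the function name `empirical_fdr` and the toy p-value lists are made up for this post, and the lists are filled with arbitrary values chosen so that 1,000 ChIP peaks and 48 swapped (control) peaks pass the threshold.

```python
def empirical_fdr(chip_pvalues, control_pvalues, threshold):
    """Swap-based empirical FDR at a given p-value threshold.

    chip_pvalues:    p-values of peaks called with ChIP as treatment, Input as control
    control_pvalues: p-values of peaks called after swapping the two samples
    Both arguments are hypothetical inputs for illustration only.
    """
    n_chip = sum(p <= threshold for p in chip_pvalues)
    n_control = sum(p <= threshold for p in control_pvalues)
    if n_chip == 0:
        return None  # ratio is undefined when no ChIP peak passes the threshold
    return n_control / n_chip


# Toy data: 1,000 ChIP peaks and 48 swapped peaks pass p <= 0.00024;
# the remaining peaks have larger p-values and are not counted.
chip_pvalues = [0.00024] * 1000 + [0.01] * 500
control_pvalues = [0.0001] * 48 + [0.005] * 200

fdr = empirical_fdr(chip_pvalues, control_pvalues, threshold=0.00024)
print(fdr)  # 48 / 1000
```

To get the FDR for a particular peak, use that peak's own p-value as the threshold, as was done for peak X above.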
modified 4.4 years ago
4.4 years ago by
Gary • 470