Question

DiffBind gives peaks of different widths, even when setting "summits"

0

Entering edit mode

3.8 years ago

loveleaves102 • 0

Hello,

I'm using DiffBind to identify differential binding affinity of histone marks H3k27ac between two conditions. I set the summits parameter in dba.count(), but the resulting peaks still have different widths. For example:

seqname  start             end                width    
6         29330809        29331809          1001
9         43674228        43675228          1001
1         84561539        84563500          1962

Can you guys help explain this to me? Thank you a lot! The following is my code:

enhancer <- dba(sampleSheet = sample, peakFormat = "narrow", scoreCol = 9)
enhancer <- dba.count(enhancer, minOverlap = 4,  summits = 500, score=DBA_SCORE_TMM_MINUS_FULL)
enhancer <- dba.contrast(enhancer, enhancer$masks$cond1, enhancer$masks$cond2, "cond1", "con2")
enhancer <- dba.analyze(enhancer, method=DBA_ALL_METHODS, bSubControl = FALSE, bFullLibrarySize = FALSE, bTagwise = TRUE)
enhancer.DB <- dba.report(enhancer, method=DBA_EDGER)

This is my session info:

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

Matrix products: default
BLAS/LAPACK: /path/to/lib/libopenblas_haswell-r0.3.4.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] DiffBind_2.14.0             SummarizedExperiment_1.16.1
 [3] DelayedArray_0.12.3         BiocParallel_1.20.1
 [5] matrixStats_0.56.0          Biobase_2.46.0
 [7] GenomicRanges_1.38.0        GenomeInfoDb_1.22.1
 [9] IRanges_2.20.2              S4Vectors_0.24.4
[11] BiocGenerics_0.32.0

loaded via a namespace (and not attached):
  [1] Category_2.52.1          bitops_1.0-6             bit64_0.9-7
  [4] RColorBrewer_1.1-2       progress_1.2.2           httr_1.4.1
  [7] Rgraphviz_2.30.0         backports_1.1.6          tools_3.6.0
 [10] R6_2.4.1                 KernSmooth_2.23-16       DBI_1.1.0
 [13] colorspace_1.4-1         withr_2.1.2              tidyselect_1.0.0
 [16] prettyunits_1.1.1        bit_1.1-15.2             curl_4.3
 [19] compiler_3.6.0           graph_1.64.0             cli_2.0.2
 [22] rtracklayer_1.46.0       checkmate_2.0.0          caTools_1.18.0
 [25] scales_1.1.0             genefilter_1.68.0        RBGL_1.62.1
 [28] askpass_1.1              rappdirs_0.3.1           stringr_1.4.0
 [31] digest_0.6.25            Rsamtools_2.2.3          AnnotationForge_1.28.0
 [34] XVector_0.26.0           jpeg_0.1-8.1             pkgconfig_2.0.3
 [37] BSgenome_1.54.0          dbplyr_1.4.2             limma_3.42.2
 [40] rlang_0.4.5              RSQLite_2.2.0            GOstats_2.52.0
 [43] hwriter_1.3.2            gtools_3.8.2             dplyr_0.8.5
 [46] VariantAnnotation_1.32.0 RCurl_1.98-1.1           magrittr_1.5
 [49] GO.db_3.10.0             GenomeInfoDbData_1.2.2   Matrix_1.2-18
 [52] Rcpp_1.0.4.6             munsell_0.5.0            fansi_0.4.1
 [55] lifecycle_0.2.0          yaml_2.2.1               stringi_1.4.6
 [58] edgeR_3.28.1             zlibbioc_1.32.0          gplots_3.0.3
 [61] BiocFileCache_1.10.2     grid_3.6.0               blob_1.2.1
 [64] ggrepel_0.8.2            gdata_2.18.0             crayon_1.3.4
 [67] lattice_0.20-41          splines_3.6.0            Biostrings_2.54.0
 [70] annotate_1.64.0          GenomicFeatures_1.38.2   hms_0.5.3
 [73] batchtools_0.9.13        locfit_1.5-9.4           pillar_1.4.3
 [76] rjson_0.2.20             systemPipeR_1.20.0       base64url_1.4
 [79] biomaRt_2.42.1           XML_3.99-0.3             glue_1.4.0
 [82] ShortRead_1.44.3         latticeExtra_0.6-29      data.table_1.12.8
 [85] vctrs_0.2.4              png_0.1-7                gtable_0.3.0
 [88] openssl_1.4.1            purrr_0.3.3              amap_0.8-18
 [91] assertthat_0.2.1         ggplot2_3.3.0            xtable_1.8-4
 [94] survival_3.1-12          pheatmap_1.0.12          tibble_3.0.0
 [97] GenomicAlignments_1.22.1 AnnotationDbi_1.48.0     memoise_1.1.0
[100] brew_1.0-6               ellipsis_0.3.0           GSEABase_1.48.0

ChIP-Seq • 1.4k views

ADD COMMENT • link updated 3.8 years ago by Rory Stark ★ 2.0k • written 3.8 years ago by loveleaves102 • 0

score 2 · Accepted Answer · 2020-07-04

2

Entering edit mode

3.8 years ago

Rory Stark ★ 2.0k

DiffBind always merges overlapping peak intervals. In your case, when you center around summits to get 1001pp intervals (500bp up and downstream of the summit), some of those 1k intervals themselves overlap and are merged.

ADD COMMENT • link 3.8 years ago by Rory Stark ★ 2.0k

1

Entering edit mode

Note that if you run dba.count() again using the consensus peaks from the first call, it will find the summit of the merged peak and re-center, leaving you with all peaks the same width (but one fewer peak)..

ADD REPLY • link 3.8 years ago by Rory Stark ★ 2.0k