Diffbind not calling sites as differentially bound
1
0
Entering edit mode
20 months ago
ipsc-jl • 0

Hi Rory,

I am using DiffBind to identify sites differentially bound (affinity analysis) by a transcription factor in control (WT), heterozygous (HET), and knockout (KO) cells. The data was generated using CUT&RUN. The TF was profiled for each genotype, and each genotype also has a negative control Rb IgG bam file.

I used both default normalization method and spike in normalization. The results are similar for both. DiffBind identifies a good number of sites that are significantly differentially bound between WT and KO. Comparing HET v. KO shows a reduction in sites that are significantly differentially bound, as expected.

What I am most interested in is which sites have significantly reduced peak height between WT and HET.

However, there is only 1 site that is significant when comparing WT v. HET. But on visual inspection, many sites have peaks that are about 50% reduced in peak height in the HET compared to WT. I am wondering why these sites are not being identified as significantly (FDR<0.05) differentially bound and if you have advice on how to capture these sites.

Is it possible to get a comparison that is (WT v. KO) v. (HET v. KO)? (I'm assuming WT v. HET is different from this since it does not account for KO).

Thank you for your time!

# spike in normalization
dbObj = dba.normalize(dbObj, normalize=DBA_NORM_LIB, spikein=TRUE)
dbObj = dba.analyze(dbObj)

dba.contrast(dbObj, bGetCoefficients=T)
[1] "Intercept"           "Condition_HET_vs_WT" "Condition_KO_vs_WT" 

# results 
Design: [~Condition] | 3 Contrasts:
     Factor Group Samples Group2 Samples2 DB.edgeR DB.DESeq2
1 Condition    WT       7    HET        6        1         1
2 Condition    WT       7     KO        6     6453      3652
3 Condition   HET       6     KO        6     4044      2711

# please also include the results of running the following in an R session 

sessionInfo( )
R version 4.1.2 (2021-11-01)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /programs/biogrids/x86_64-linux/r/4.1/lib/libopenblasp-r0.3.17.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] BiocParallel_1.28.3         csaw_1.28.0                
 [3] profileplyr_1.10.2          RColorBrewer_1.1-3         
 [5] forcats_0.5.1               stringr_1.4.0              
 [7] dplyr_1.0.9                 purrr_0.3.4                
 [9] readr_2.1.2                 tidyr_1.2.0                
[11] tibble_3.1.7                ggplot2_3.3.6              
[13] tidyverse_1.3.1             DiffBind_3.2.7             
[15] SummarizedExperiment_1.24.0 Biobase_2.54.0             
[17] MatrixGenerics_1.6.0        matrixStats_0.62.0         
[19] GenomicRanges_1.46.1        GenomeInfoDb_1.30.0        
[21] IRanges_2.28.0              S4Vectors_0.30.1           
[23] BiocGenerics_0.40.0        

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3                           
  [2] rtracklayer_1.54.0                       
  [3] R.methodsS3_1.8.1                        
  [4] coda_0.19-4                              
  [5] bit64_4.0.5                              
  [6] irlba_2.3.5                              
  [7] DelayedArray_0.20.0                      
  [8] R.utils_2.11.0                           
  [9] data.table_1.14.2                        
 [10] hwriter_1.3.2.1                          
 [11] KEGGREST_1.34.0                          
 [12] RCurl_1.98-1.6                           
 [13] doParallel_1.0.17                        
 [14] generics_0.1.1                           
 [15] GenomicFeatures_1.46.5                   
 [16] org.Mm.eg.db_3.14.0                      
 [17] preprocessCore_1.56.0                    
 [18] EnrichedHeatmap_1.22.0                   
 [19] RSQLite_2.2.14                           
 [20] shadowtext_0.1.0                         
 [21] bit_4.0.4                                
 [22] tzdb_0.3.0                               
 [23] enrichplot_1.14.2                        
 [24] xml2_1.3.3                               
 [25] lubridate_1.8.0                          
 [26] assertthat_0.2.1                         
 [27] viridis_0.6.2                            
 [28] amap_0.8-18                              
 [29] apeglm_1.16.0                            
 [30] hms_1.1.1                                
 [31] fansi_0.5.0                              
 [32] restfulr_0.0.13                          
 [33] progress_1.2.2                           
 [34] caTools_1.18.2                           
 [35] dbplyr_2.1.1                             
 [36] readxl_1.4.0                             
 [37] geneplotter_1.72.0                       
 [38] igraph_1.3.1                             
 [39] DBI_1.1.2                                
 [40] htmlwidgets_1.5.4                        
 [41] ellipsis_0.3.2                           
 [42] rGREAT_1.24.0                            
 [43] backports_1.4.1                          
 [44] annotate_1.72.0                          
 [45] biomaRt_2.50.3                           
 [46] vctrs_0.4.1                              
 [47] cachem_1.0.6                             
 [48] withr_2.4.3                              
 [49] ggforce_0.3.3                            
 [50] BSgenome_1.62.0                          
 [51] bdsmatrix_1.3-4                          
 [52] GenomicAlignments_1.30.0                 
 [53] treeio_1.18.1                            
 [54] prettyunits_1.1.1                        
 [55] TxDb.Mmusculus.UCSC.mm10.knownGene_3.10.0
 [56] cluster_2.1.3                            
 [57] DOSE_3.20.1                              
 [58] ape_5.6                                  
 [59] lazyeval_0.2.2                           
 [60] crayon_1.4.2                             
 [61] genefilter_1.76.0                        
 [62] edgeR_3.36.0                             
 [63] pkgconfig_2.0.3                          
 [64] tweenr_1.0.2                             
 [65] nlme_3.1-157                             
 [66] rlang_1.0.2                              
 [67] lifecycle_1.0.1                          
 [68] filelock_1.0.2                           
 [69] BiocFileCache_2.2.1                      
 [70] modelr_0.1.8                             
 [71] invgamma_1.1                             
 [72] cellranger_1.1.0                         
 [73] polyclip_1.10-0                          
 [74] tiff_0.1-11                              
 [75] Matrix_1.4-0                             
 [76] aplot_0.1.4                              
 [77] ashr_2.2-54                              
 [78] chipseq_1.42.0                           
 [79] boot_1.3-28                              
 [80] reprex_2.0.1                             
 [81] GlobalOptions_0.1.2                      
 [82] pheatmap_1.0.12                          
 [83] png_0.1-7                                
 [84] viridisLite_0.4.0                        
 [85] rjson_0.2.21                             
 [86] bitops_1.0-7                             
 [87] R.oo_1.24.0                              
 [88] KernSmooth_2.23-20                       
 [89] Biostrings_2.62.0                        
 [90] blob_1.2.3                               
 [91] shape_1.4.6                              
 [92] mixsqp_0.3-43                            
 [93] SQUAREM_2021.1                           
 [94] qvalue_2.26.0                            
 [95] ShortRead_1.52.0                         
 [96] jpeg_0.1-9                               
 [97] gridGraphics_0.5-1                       
 [98] TxDb.Mmusculus.UCSC.mm9.knownGene_3.2.2  
 [99] scales_1.2.0                             
[100] memoise_2.0.1                            
[101] magrittr_2.0.3                           
[102] plyr_1.8.7                               
[103] gplots_3.1.3                             
[104] zlibbioc_1.40.0                          
[105] compiler_4.1.2                           
[106] scatterpie_0.1.7                         
[107] TxDb.Hsapiens.UCSC.hg38.knownGene_3.13.0 
[108] BiocIO_1.4.0                             
[109] bbmle_1.0.24                             
[110] plotrix_3.8-2                            
[111] clue_0.3-60                              
[112] DESeq2_1.34.0                            
[113] Rsamtools_2.10.0                         
[114] cli_3.3.0                                
[115] systemPipeR_2.0.8                        
[116] XVector_0.34.0                           
[117] patchwork_1.1.1                          
[118] MASS_7.3-57                              
[119] tidyselect_1.1.2                         
[120] stringi_1.7.6                            
[121] emdbook_1.3.12                           
[122] yaml_2.3.5                               
[123] GOSemSim_2.20.0                          
[124] locfit_1.5-9.5                           
[125] latticeExtra_0.6-29                      
[126] ggrepel_0.9.1                            
[127] grid_4.1.2                               
[128] fastmatch_1.1-3                          
[129] tools_4.1.2                              
[130] circlize_0.4.14                          
[131] rstudioapi_0.13                          
[132] foreach_1.5.2                            
[133] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2  
[134] metapod_1.2.0                            
[135] gridExtra_2.3                            
[136] farver_2.1.0                             
[137] ggraph_2.0.5                             
[138] digest_0.6.29                            
[139] Rcpp_1.0.8.3                             
[140] broom_0.8.0                              
[141] org.Hs.eg.db_3.14.0                      
[142] httr_1.4.3                               
[143] AnnotationDbi_1.56.2                     
[144] ComplexHeatmap_2.11.1                    
[145] colorspace_2.0-3                         
[146] rvest_1.0.2                              
[147] XML_3.99-0.9                             
[148] fs_1.5.2                                 
[149] truncnorm_1.0-8                          
[150] splines_4.1.2                            
[151] yulab.utils_0.0.4                        
[152] tidytree_0.3.9                           
[153] graphlayouts_0.8.0                       
[154] ggplotify_0.1.0                          
[155] xtable_1.8-4                             
[156] jsonlite_1.8.0                           
[157] ggtree_3.2.1                             
[158] soGGi_1.24.1                             
[159] tidygraph_1.2.1                          
[160] ggfun_0.0.6                              
[161] R6_2.5.1                                 
[162] pillar_1.7.0                             
[163] htmltools_0.5.2                          
[164] glue_1.6.0                               
[165] fastmap_1.1.0                            
[166] codetools_0.2-18                         
[167] ChIPseeker_1.28.3                        
[168] fgsea_1.20.0                             
[169] GreyListChIP_1.26.0                      
[170] mvtnorm_1.1-3                            
[171] utf8_1.2.2                               
[172] lattice_0.20-45                          
[173] numDeriv_2016.8-1.1                      
[174] curl_4.3.2                               
[175] gtools_3.9.2                             
[176] GO.db_3.14.0                             
[177] survival_3.3-1                           
[178] limma_3.50.0                             
[179] munsell_0.5.0                            
[180] DO.db_2.9                                
[181] GetoptLong_1.0.5                         
[182] GenomeInfoDbData_1.2.7                   
[183] iterators_1.0.14                         
[184] haven_2.5.0                              
[185] reshape2_1.4.4                           
[186] gtable_0.3.0
DiffBind • 769 views
ADD COMMENT
0
Entering edit mode
10 months ago
Rory Stark ★ 2.0k

Assuming that your spike-ins are good, there are a number of reasons you could be seeing this. Often it is a sign of an underpowered experiment with only a few replicates that have wide variance, but I see that you have 6+ replicates for your sample groups, and each of the sample groups is able to generate good results in other contrasts, so I'll assume this is not the case here.

I'd consider two possible explanations, one biological and one technical.

Biologically, if most of the effect is present in the HET samples and not much is added in the full KO, you could see this pattern. In that case we'd expect a high overlap between the WT-KO and HET-KO differentially bound sites; you can check these using dba.plotVenn().

However you are seeing pileups in the browser that you believe indicates something different. First I'd check the read count levels for specific sites you think are not being identified to see why they are getting higher FDR values. You can do this by setting bCounts=TRUE when calling dba.report(), finding the sites of interest, and seeing if the read counts reported are consistent with what you see in the browser. By default, you will get "normalized" read counts; it may be that the normalization is altering the counts to mask the signal. You can also set bNormalized=FALSE to retrieve the raw read counts. Another way to check what the normalization is doing is to compare normalized and non-normalized MA plots:

par(mfrow=c(2,1))
dba.plotMA(dbObj, condition=1, bNormalized=FALSE)
dba.plotMA(dbObj, condition=1, bNormalized=TRUE)

You can compare the effect of normalization with what you see in the other two contrasts to see if something unexpected is going in just this particular contrast.

ADD COMMENT

Login before adding your answer.

Traffic: 1995 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6