Tool: DiffBind 3.0: Extensive updates in Bioconductor 3.12
Posted 4 weeks ago by Rory Stark (University of Cambridge, Cancer Research UK - Cambridge Institute):

As part of the latest Bioconductor release 3.12, DiffBind 3.0 includes extensive updates of which users should be aware. The purpose of these updates, reflecting user comments and requests, is to provide more power and control to users, while incorporating up-to-date methods and utilizing knowledge and experience gained in the 10 years since DiffBind was first written.

The main updates are in the areas of modelling, analysis, normalization, and blacklists, as follows:

Modelling: DiffBind now supports modelling using arbitrary design formulas, including multi-factor designs with any combination of metadata factors. Contrasts can be specified in a variety of ways to be evaluated against the design. All sample data in the experiment is incorporated into a single model against which contrasts are evaluated (previously, each contrast was handled in a separate model with only the samples directly involved in the contrast).
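As a rough sketch of what this looks like in practice, the following uses the tamoxifen_counts example data shipped with DiffBind (which loads an object named tamoxifen). The particular design formula and contrast are illustrative assumptions, not recommendations; see ?dba.contrast for the full set of ways to specify contrasts.

```r
library(DiffBind)

# Example data from the package: loads a counted DBA object named `tamoxifen`
data(tamoxifen_counts)

# One model for all samples: a two-factor design, with a single contrast
# (Resistant vs. Responsive) evaluated against that model
tamoxifen <- dba.contrast(tamoxifen,
                          design   = "~Tissue + Condition",
                          contrast = c("Condition", "Resistant", "Responsive"))

# Fit the model and evaluate the contrast with the default method
tamoxifen <- dba.analyze(tamoxifen)

# List the design and contrasts now stored in the object
dba.show(tamoxifen, bContrasts = TRUE)
```

The contrast is given here as a factor name followed by two of its levels; other forms are described on the help page.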

Analysis: More standardized usage of the underlying edgeR and DESeq2 packages is implemented. The global objects for these analyses can be easily extracted for fine-grained control over the analysis. A default analysis can be run from any starting point, even from just a samplesheet: whichever of the loading, blacklisting/greylisting, consensus, counting, modelling, and analysis steps have not yet been run are completed automatically.
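A minimal sketch of extracting the underlying analysis object, again using the packaged tamoxifen_counts data. It assumes the bRetrieveAnalysis argument of dba.analyze() as described on its help page; the design and contrast are illustrative only.

```r
library(DiffBind)

data(tamoxifen_counts)  # loads a counted DBA object named `tamoxifen`

tamoxifen <- dba.contrast(tamoxifen,
                          design   = "~Tissue + Condition",
                          contrast = c("Condition", "Resistant", "Responsive"))

# Run the DESeq2-based analysis
tamoxifen <- dba.analyze(tamoxifen, method = DBA_DESEQ2)

# Instead of re-running the analysis, return the underlying DESeq2 object
dds <- dba.analyze(tamoxifen, bRetrieveAnalysis = DBA_DESEQ2)
class(dds)
```

The returned object can then be passed to DESeq2 functions directly; passing DBA_EDGER instead should return the corresponding edgeR object if an edgeR analysis has been run.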

Normalization: Having identified normalization as a key step in a successful differential binding analysis, normalization options have been split out into a new interface function, dba.normalize(), to provide fine-grained control. Normalization against background reads is supported (using functions from the csaw package), as are offsets (e.g. loess fit), exogenous spike-ins, and "parallel factor" normalization. An extensive section examining the impact of the various normalization options has been added to the vignette.
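For illustration, a short sketch of switching between two normalization schemes with dba.normalize() on the packaged example data. The two combinations shown are just examples; background, spike-in, and offset normalization need the original read files and are not shown here.

```r
library(DiffBind)

data(tamoxifen_counts)  # loads a counted DBA object named `tamoxifen`

# Library-size normalization based on full library sizes
# (the new default, stated explicitly here)
tamoxifen <- dba.normalize(tamoxifen,
                           normalize = DBA_NORM_LIB,
                           library   = DBA_LIBSIZE_FULL)

# Alternative: the native method of the analysis package (TMM for edgeR,
# RLE for DESeq2), computed on reads in peaks only
tamoxifen <- dba.normalize(tamoxifen,
                           normalize = DBA_NORM_NATIVE,
                           library   = DBA_LIBSIZE_PEAKREADS)

# Inspect the normalization settings and factors currently stored
norm <- dba.normalize(tamoxifen, bRetrieve = TRUE)
str(norm)
```

Each call replaces the previous normalization settings, so only the last one is used by a subsequent dba.analyze().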

Blacklists and Greylists: New interface function dba.blacklist() applies ENCODE blacklists by default if the (automatically detected) genome is supported. Greylists derived from experiment-specific controls are also supported, with automatic generation of greylists implemented using the GreyListChIP package.
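A small sketch of applying a blacklist explicitly rather than relying on automatic genome detection. It uses the packaged tamoxifen_peaks data (hg19 peaks), so the hg19 constant is used; for other genomes the corresponding constant, or the genome name as a string, would be substituted.

```r
library(DiffBind)

data(tamoxifen_peaks)  # loads a DBA object named `tamoxifen` holding peaksets

# Number of intervals in each peakset before filtering
before <- dba.show(tamoxifen)$Intervals

# Apply the ENCODE hg19 blacklist; greylist generation is skipped here
# because it requires the control read files
tamoxifen <- dba.blacklist(tamoxifen,
                           blacklist = DBA_BLACKLIST_HG19,
                           greylist  = FALSE)

# Number of intervals after blacklisted regions are removed
after <- dba.show(tamoxifen)$Intervals
data.frame(before, after)
```

Setting blacklist=FALSE should skip the step entirely for users who prefer to filter peaks themselves.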

There are many more changes beyond those listed here. Please see the NEWS file for a more detailed list of changes to functionality and default behavior, as well as the revamped vignette. The key changes to defaults are listed at the end of this message and will be added to the ?DiffBind3 help page.

I will be monitoring the support forum as usual to help any users encountering issues in using the new version.

Regards- Rory

Note on backward compatibility: While efforts have been made to maintain backward compatibility for existing users' scripts and data objects, certain issues may arise. Existing scripts should still run but will use the updated methods unless dba.contrast() is called with design=FALSE. Stored data objects will automatically be updated and run in backward-compatibility mode. See the help page ?DiffBind3 for more discussion of backward compatibility issues.

Changes in default settings:

  1. blacklist is applied by default, if available, using automatic detection of reference genome.
  2. greylists are generated from controls and applied by default.
  3. minimum read counts are now 0 instead of being rounded up to 1 (this is now controllable).
  4. centering peaks around summits is now done by default using 401-bp wide peaks (recommend to use summits=100 for ATAC-seq).
  5. read counting is now performed by summarizeOverlaps() by default, with single-end/paired-end counting automatically detected.
  6. filtering is performed by default; consensus peaks where no peak has more than five reads in any sample are filtered.
  7. control read subtraction is now turned off by default if a greylist is present.
  8. normalization is based on full library sizes by default for both edgeR and DESeq2 analyses.
  9. score is set to normalized values by default.
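Several of these defaults can be overridden at the counting step. The sketch below wraps this in a hypothetical helper, count_atac(), that takes the path to your own samplesheet; the helper name and the particular overrides are assumptions for illustration, and the argument names are those given on the ?dba.count help page.

```r
library(DiffBind)

# Hypothetical helper: `sample_sheet` is the path to your own samplesheet CSV
count_atac <- function(sample_sheet) {
  # Load the peaksets and metadata listed in the samplesheet
  dbObj <- dba(sampleSheet = sample_sheet)

  # Re-centre peaks on summits using narrower intervals, as suggested for
  # ATAC-seq (summits = 100 extends 100 bp either side of each summit), and
  # restore the previous behaviour of a minimum count of 1
  dba.count(dbObj, summits = 100, minCount = 1)
}
```

A call such as count_atac("samples.csv"), with a samplesheet of your own, returns a counted object ready for dba.normalize() and dba.contrast().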
modified 22 days ago • written 4 weeks ago by Rory Stark

Hi Rory, this is an important update. I do need to ask: how can we implement a DB_BLACKLIST_38? All my data are now for hg38, and this important feature is unusable!

Thank you in advance

written 22 days ago by theodore

I noticed the omission this morning and checked in a fix earlier today, exporting DBA_BLACKLIST_HG38 as documented on the help page for dba.blacklist(). The fix will appear in the next update as DiffBind_3_0_2 in the next day or so.

In the current version, if you run with the default blacklist=TRUE, the correct reference genome should automatically be detected, or else you can specify blacklist="BSgenome.Hsapiens.UCSC.hg38" (which is the actual value of DBA_BLACKLIST_HG38).

modified 22 days ago • written 22 days ago by Rory Stark

Perfect, many thanks!

written 22 days ago by theodore