Question: Diffbind - Very high proportion of significant peaks
0
gravatar for kautilya
20 months ago by
kautilya400
United States
kautilya400 wrote:

I am using Diffbind to call differential peaks on an ATAC seq dataset of condition A vs B each having 3 replicates.

The total number of peaks post union is - 106700 and of these 33197 (~30%) come out as significant even with a stringent FDR cutoff of 0.01.

Is this high proportion of significant peaks expected? I do understand it depends on the data but still the number seems very high. Should any additional steps should be added to the protocol below to increase the specificity?

The protocol I followed was as follows:-

  • Peaks were called for replicate individually with macs2 using the -format BAMPE to use the fragment size information in BAM for extension(reads are paired-end). The p-value cutoff here was kept lenient at 0.01 to allow more peaks early on.

  • DESeq2 was then used within Diffbind for normalization and finding the differential peaks.

  • The--summits option was not used in Diffbind as size is not a particular concern.

Thanks!.

atac-seq deseq2 diffbind • 1.4k views
ADD COMMENTlink modified 19 months ago • written 20 months ago by kautilya400
1

Some details about the experiment itself, so what are the conditions, which organism, which treatment etc., would help.

ADD REPLYlink written 20 months ago by ATpoint26k

Thanks for the reply!. The experiment is a gene knockout experiment and the organism here is Mus musculus(using mm10 as reference). There is no input/control. It is comparison of wild type vs the knockout each having 3 replicates.

ADD REPLYlink modified 20 months ago • written 20 months ago by kautilya400
1

Which gene was knocked out? Depending on the gene function, e.g. a histone deacetylase or methyltransferase, it might make sense that you see large-scale effects on chromatin accessability. Did the quality control steps look ok, so do you see the nucleosomal pattern in the insert sizes of the paired-end sequencing, and do the replicates correlate well among each other?

ADD REPLYlink written 20 months ago by ATpoint26k

Thanks again. Apologies was out of office.

  • The gene which was knocked out is the part of an nucleosome remodeling complex so i think the large-scale effects can be expected.
  • The insert sizes of at least 4 of 6 samples shows a clear increase around 170-200 bp and another small one at ~400bp. This corroborates with excepted mono/di-nucleosomal peaks. Example
  • Also the replicate concordance using IDR is around 30% if directly compared and ~65% when a pooled peak list for a condition is used, at a cutoff of 0.01. I am still unclear on what are the expected ranges for good IDR scores though.
ADD REPLYlink modified 19 months ago • written 19 months ago by kautilya400
1

If your aim is to call different open regions I'd first be concerned about using BAMPE when you call peaks. When you use BAMPE, MACS2 will create a coverage vector from the fragment (created by filling in the space between the two mapped ends). However, pairs can be mapped either side of a nucleosome so when you pileup the fragment, you're actually calling peaks in both closed and open regions.

ADD REPLYlink written 20 months ago by James Ashmore2.7k

Thanks for the reply! I am new to peak based analysis in general. I chose BAMPE over a specific shift/extsize based on the MACS2 authors recommendation here

The other parameters that I have seen being used for ATAC are --shift -75/-100 extsize 150/200. But will these not then result in further shifting the reads which are mapped on either side of a nucleosome even more away from the centre and going into closed regions nearby? I realize it may be a naive question but I have not been able to get a good visualization of the part of the nucleosome the reads are coming from in ATAC.

Also what parameters would you recommend ideally for a differential ATAC analysis.

ADD REPLYlink written 19 months ago by kautilya400
4
gravatar for Rory Stark
20 months ago by
Rory Stark550
University of Cambridge, Cancer Research UK - Cambridge Institute
Rory Stark550 wrote:

I have certainly see this level of differentially Some questions:

  • Do you expect the conditions to change the open chromatin landscape quite a bit?

  • How are the differential sites divided between the two conditions? Do they mostly show increases in one condition, or are there many changes in both directions?

It may help to look at the MA plots using dba.plotMA(). You can also compare with bNormalized=FALSE to see if the impact of nomalization. If the MA plots show a string shift in one direction, that usually indicates that one of the conditions has a big impact on chromatin state.

You can also look at some of the sites in more detail to see if they are convincingly different. A couple ways to do this:

  • Get the report with dba.report() and set bCounts=TRUE to see the fold change and the distribution of read counts.
  • Look at all six conditions in a browser (eg. IGV). Check some of the differential sites to see what the pileups look like.

Cheers-

Rory

ADD COMMENTlink written 20 months ago by Rory Stark550

Thanks for replying!.

  • Do you expect the conditions to change the open chromatin landscape quite a bit?

    Yes, this is a gene knockout experiment and the gene is part of a nucleosome remodelling complex.

  • How are the differential sites divided between the two conditions? Do they mostly show increases in one condition, or are there many changes in both directions?

    Actually this part had a bit of perplexing results. I tried using both deseq2 and edgeR for the differential peak calling. However the results from both of them differ highly(ma plots - deseq2/edgeR/no normalization). With bNormalized=FALSE the data does show a strong shift in one direction.

Are results from deseq2 and edgeR expected to be so different?

I did take a look at the raw counts and visualized the results in IGV. The pileups do look visually different across the replicates even for peaks at a lower FDR cutoff.

ADD REPLYlink modified 19 months ago • written 19 months ago by kautilya400
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1821 users visited in the last hour