DiffBind to call differential binding of Super-enhancers from ROSE
3
0
2.9 years ago
Researcher ▴ 90

Hi all, I want to use BED files listing the super-enhancers identified by ROSE in each sample and check their differential binding. I have a few questions: Can I use DiffBind for this? Can DiffBind read the BED files and use their respective BAM files to calculate differential binding?

Has anybody done this before? Any suggestions will be highly appreciated.

Thanks

Super-enhancers ROSE DiffBind chipseq • 1.8k views
3
2.9 years ago

Super-enhancers are nothing but collections of closely spaced individual "peaks" (less than 12.5 kb apart, I guess) in linear genomic space. So if you want to do a differential analysis, you could just perform differential binding on all individual peaks and show that differential peaks are enriched within super-enhancers (Fisher's exact test on whether a differential peak is part of a super-enhancer or not). Other than that, I cannot think of a way of doing it. You could consider the whole linear genomic space spanned by a super-enhancer, but it will be very noisy and include a lot of background reads.
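The enrichment test mentioned above can be sketched in a few lines. This is only a minimal illustration, assuming you have already counted how many differential and non-differential peaks fall inside or outside super-enhancers; the function name and the example counts are hypothetical, and the test is the one-sided (enrichment) form of Fisher's exact test computed from the hypergeometric distribution.

```python
from math import comb


def fisher_enrichment_pvalue(diff_in_se, diff_not_se, nondiff_in_se, nondiff_not_se):
    """One-sided Fisher's exact test for a 2x2 table.

    Returns P(X >= diff_in_se), where X is the number of differential peaks
    inside super-enhancers under the hypergeometric null (no association).
    """
    n_diff = diff_in_se + diff_not_se          # all differential peaks
    n_se = diff_in_se + nondiff_in_se          # all peaks inside SEs
    total = n_diff + nondiff_in_se + nondiff_not_se
    upper = min(n_diff, n_se)                  # largest possible overlap
    numerator = sum(
        comb(n_se, k) * comb(total - n_se, n_diff - k)
        for k in range(diff_in_se, upper + 1)
    )
    return numerator / comb(total, n_diff)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = fisher_enrichment_pvalue(
        diff_in_se=40, diff_not_se=460, nondiff_in_se=300, nondiff_not_se=19200
    )
    print(f"one-sided p-value: {p:.3g}")
```

A small p-value here would suggest that differential peaks sit inside super-enhancers more often than expected by chance; it says nothing about the direction of the change.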

1

Speaking from personal experience, this is the best way to go about it. Trying to use the SE boundaries directly will not yield the results you want.

4

I agree. In fact, I would call peaks as normal, then take the peak summits and resize them to the average peak width, which is typically < 1 kb for H3K27ac. If windows overlap, merge them and get a count matrix for the resulting genomic regions. Perform differential binding as usual and then filter the results for the SEs. Do not focus on SEs, as these make up only a small part of all peaks. Even though it has been shown that these regions are enriched near genes that are "important" for cellular identity, more recent ATAC-seq data have also shown that the actual open-chromatin parts of these SE stretches are distinct and short (if memory serves, < 1 kb), so it is questionable what these long stretches actually are. I would always try to limit the peak sizes as much as possible to avoid large and inflated counts.
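The resize-and-merge step described above could look roughly like this. It is a sketch under assumptions: summits are given as 0-based positions, the 500 bp width is just a placeholder for your own average peak width, and the function name is made up for illustration.

```python
from collections import defaultdict


def windows_from_summits(summits, width=500):
    """Build fixed-width windows around summits and merge overlapping ones.

    summits: iterable of (chrom, position) with 0-based positions.
    Returns a sorted list of (chrom, start, end) half-open intervals.
    """
    half = width // 2
    by_chrom = defaultdict(list)
    for chrom, pos in summits:
        # Clamp at 0 so windows never start before the chromosome begins.
        by_chrom[chrom].append((max(0, pos - half), pos + half))

    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start < current_end:
                # Overlaps the window being built: extend it.
                current_end = max(current_end, end)
            else:
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


if __name__ == "__main__":
    toy_summits = [("chr1", 1000), ("chr1", 1200), ("chr1", 5000), ("chr2", 300)]
    for region in windows_from_summits(toy_summits, width=500):
        print(*region, sep="\t")
```

The resulting regions can be written to a BED file and used for counting. Merged windows end up wider than `width`, so it is worth checking how many of them there are.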

0

Hi ATpoint, thank you so much for such a nice description. Could you please explain which summit option you just mentioned? Is it from MACS2 or DiffBind? Both tools have an option with the same name, "summit", and I am not sure which one will help here.

Thanks again!

0

I am referring to MACS2 and either its narrowPeak output (column 10) or the summit BED files. There is indeed a resizing option in DiffBind which you could use. Check its manual; I do not know the command by heart.
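For the narrowPeak route, column 10 holds the summit as a 0-based offset from the peak start (or -1 when no summit was called), so the genomic summit position is start + offset. A minimal sketch of that conversion, with a hypothetical function name and a made-up example line:

```python
def summit_from_narrowpeak(line):
    """Return (chrom, summit_start, summit_end) for one narrowPeak line.

    Column 10 (index 9) is the summit offset relative to the peak start.
    The result is a 1 bp half-open BED interval at the summit.
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    start = int(fields[1])
    offset = int(fields[9])
    if offset < 0:
        # The narrowPeak format uses -1 when no point source was called.
        raise ValueError(f"no summit recorded for peak {fields[3]}")
    summit = start + offset
    return chrom, summit, summit + 1


if __name__ == "__main__":
    # Made-up narrowPeak record, for illustration only.
    example = "chr1\t1000\t1600\tpeak_1\t250\t.\t12.5\t30.1\t27.0\t220"
    print(summit_from_narrowpeak(example))
```

The 1 bp intervals produced this way can then be resized to a fixed width around each summit.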

1
2.9 years ago
venu 7.0k

You can pass a custom count matrix to DiffBind (check the reference manual, page 3: Construct a DBA object).

• For each sample, make a BED file of super-enhancers
• Calculate read counts from all your BAM files (using the deepTools multiBamSummary tool)
• Pass this to DiffBind
0

Hi venu, I am a bit lost and need your help. I have 10 BED files from 10 samples, and each one has different start and end coordinates depending on its SEs. I am confused about how to make a common BED file from all of them in order to generate the read count matrix.

Before using the

 "multiBamSummary BED-file --BED selection.bed --bamfiles file1.bam file2.bam -o results.npz"


what should I use to make the common bed file (selection.bed):

bedops --intersect or
bedops --everything or
bedops --partition or
bedops --merge or
bedtools intersect


Can you please explain it in more detail and help?

Thanks

0

If you're using DiffBind, just set consensus=TRUE. It will derive a consensus peakset between all samples for which it will compare the signal between samples.
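To make the idea of a common region set more concrete, here is a small sketch that derives regions covered by at least a chosen number of samples. It is not DiffBind's consensus algorithm and not a reimplementation of any bedops or bedtools command; the function name and the toy coordinates are hypothetical. With `min_samples=1` it behaves like a union of all samples (overlapping and adjoining regions are fused), and with `min_samples` equal to the number of samples it keeps only the stretches shared by every sample. It assumes regions do not overlap within a single sample.

```python
from collections import defaultdict


def regions_with_support(samples, min_samples):
    """Return regions covered by at least `min_samples` samples.

    samples: list of per-sample lists of (chrom, start, end), half-open,
             with no overlaps inside any one sample.
    """
    # Net change in coverage at every boundary position, per chromosome.
    deltas = defaultdict(lambda: defaultdict(int))
    for regions in samples:
        for chrom, start, end in regions:
            deltas[chrom][start] += 1
            deltas[chrom][end] -= 1

    result = []
    for chrom in sorted(deltas):
        coverage = 0
        open_start = None
        for pos in sorted(deltas[chrom]):
            coverage += deltas[chrom][pos]
            if coverage >= min_samples and open_start is None:
                open_start = pos                      # support reached
            elif coverage < min_samples and open_start is not None:
                result.append((chrom, open_start, pos))  # support lost
                open_start = None
    return result


if __name__ == "__main__":
    sample_a = [("chr1", 100, 500), ("chr1", 900, 1200)]
    sample_b = [("chr1", 300, 700), ("chr1", 1200, 1500)]
    print("union: ", regions_with_support([sample_a, sample_b], min_samples=1))
    print("shared:", regions_with_support([sample_a, sample_b], min_samples=2))
```

For super-enhancers the union can become very wide, which is the size problem raised elsewhere in this thread.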

0

Hi Jared, thank you for your reply. I just posted the same question here, please have a look. I hope you meant the same thing?

I am really stuck with this and looking for a way out.

0

Yes, that looks fine, but you should really follow the advice in geek_y's answer/ATpoint's comment above. Comparing SEs in any quantitative fashion is pointless due to the size of the ranges involved. Looking at the constituent peaks that compose them is a much more productive use of your time.

1
2.9 years ago
sim.j.baum ▴ 120

I think I know what you mean:
I used deepTools2 for that. It is similar to the figures published by Loven J. et al. 2013 in Cell, I think, where they show the difference in average binding between conditions at SEs and normal enhancers.
If so:
A.) Take the SE BED file and get the median or mean length of all SEs.
B.) Run deepTools2 computeMatrix on the SE BED regions: computeMatrix scale-regions -S <bigWig file(s)> -R <BED SE regions> -b <median or mean size of your SEs> (you need to convert your BAM files to bigWig first, for example with bamCoverage).
C.) Use the underlying matrix values for further quantitative assessment.
D.) Plot the values, for example with plotHeatmap.
Hope that helps & best wishes
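Step A can be done with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and it assumes a plain BED file with start and end in columns 2 and 3. One thing to double-check when passing the value on: in deepTools, `-m`/`--regionBodyLength` sets the length that region bodies are scaled to, while `-b` sets the upstream flank, so see `computeMatrix scale-regions --help` for which one you need.

```python
from statistics import mean, median


def se_length_summary(bed_lines):
    """Summarise region lengths (end - start) from BED-formatted lines."""
    lengths = []
    for line in bed_lines:
        # Skip blank lines and common BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        lengths.append(int(fields[2]) - int(fields[1]))
    return {"n": len(lengths), "mean": mean(lengths), "median": median(lengths)}


if __name__ == "__main__":
    # Made-up SE regions, for illustration only.
    toy_bed = [
        "chr1\t1000\t21000",
        "chr1\t50000\t62000",
        "chr2\t0\t40000",
    ]
    print(se_length_summary(toy_bed))
```

With a real file you would pass the open file handle instead of the list, for example `se_length_summary(open("my_SEs.bed"))`, where the file name is a placeholder.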

0

Thank you all for your helpful suggestions. I must say I really appreciate this platform; it is a great place for a newbie like me to get such valuable insights.

Best