Question: Differences between motif enrichment analysis and footprinting in the context of DNase-seq/ATAC-seq
6 months ago, nanoide wrote:

Hi all.

I'm planning to work with some datasets from DNase-seq and ATAC-seq. One of the most important steps of the analysis seems to be the prediction of binding sites for regulatory factors, which some sources call motif enrichment analysis. My question is pretty basic, but I'm finding it hard to understand these concepts. Is there any difference between motif enrichment analysis and ATAC-seq footprinting? I've seen both terms in different places, with different software recommended for each, but I don't really understand whether they refer to different analyses.

I hope someone can comment. Thank you.

6 months ago, geek_y (Barcelona) wrote:

Motif enrichment analysis checks whether known sequence motifs, usually represented as position weight matrices (PWMs), are over-represented in your regions/peaks of interest compared with a background set of sequences.
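To make the idea concrete, here is a minimal sketch of motif enrichment in plain Python. Everything in it is an assumption made for illustration: the 4-bp `TOY_PWM`, the example sequences, the score threshold, and the helper names. It scans only the forward strand, expects uppercase A/C/G/T, and uses a one-sided hypergeometric test, which is just one of several statistics a real tool might use.

```python
import math

# Hypothetical toy PWM (log-odds-like scores) for the 4-bp motif "ACGT".
# A real analysis would load curated PWMs from a motif database instead.
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def best_pwm_score(seq, pwm):
    """Highest PWM score over all windows of seq (forward strand only)."""
    width = len(pwm)
    scores = [
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def count_hits(seqs, pwm, threshold):
    """Number of sequences containing at least one window scoring >= threshold."""
    return sum(1 for s in seqs if best_pwm_score(s, pwm) >= threshold)


def hypergeom_enrichment_p(hits_fg, n_fg, hits_bg, n_bg):
    """One-sided P(X >= hits_fg) under a hypergeometric null."""
    total = n_fg + n_bg
    total_hits = hits_fg + hits_bg
    denom = math.comb(total, n_fg)
    p = 0.0
    for k in range(hits_fg, min(n_fg, total_hits) + 1):
        p += math.comb(total_hits, k) * math.comb(total - total_hits, n_fg - k) / denom
    return p


# Made-up sequences standing in for peak regions and background regions.
peaks = ["TTACGTAA", "GGACGTCC", "ACGTACGT", "CCCCACGT"]
background = ["TTTTTTTT", "GGGGCCCC", "ACGTTTTT", "CATCATCA"]

hits_fg = count_hits(peaks, TOY_PWM, threshold=4.0)
hits_bg = count_hits(background, TOY_PWM, threshold=4.0)
p_value = hypergeom_enrichment_p(hits_fg, len(peaks), hits_bg, len(background))
print(hits_fg, hits_bg, p_value)
```

Note that this only asks whether the motif sequence occurs more often in peaks than in background. It says nothing about whether a protein is actually sitting on any given motif instance.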

Footprinting, based on deeply sequenced ATAC-seq/DNase-seq data, tells you whether something is actually bound within the peaks of interest.

Imagine a transcription factor (TF) bound to some regions of DNA (peaks). When you do an ATAC or DNase experiment, the bound TF makes it difficult for the transposase or DNase to cleave the protein-bound stretch of DNA. This leaves "footprints", which can later be identified in the sequencing data as a dip in signal over those regions.
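The "dip" described above can be quantified in a very simple way. The sketch below is only an illustration under stated assumptions: the function name `footprint_depth`, the flank size, and the cut-count profiles are all made up, and it ignores the sequence bias of the enzyme, which dedicated footprinting tools generally try to correct for.

```python
def footprint_depth(cuts, motif_start, motif_end, flank=10):
    """Compare mean cut counts inside a motif with those in its flanks.

    cuts: per-base cut counts across a region (list of numbers).
    motif_start, motif_end: 0-based, end-exclusive motif coordinates in cuts.
    Returns 1 - (mean cuts in motif / mean cuts in flanks). Values near 1
    suggest strong protection; values near 0 or below suggest none.
    """
    left = cuts[max(0, motif_start - flank):motif_start]
    right = cuts[motif_end:motif_end + flank]
    inside = cuts[motif_start:motif_end]
    flanks = left + right
    if not inside or not flanks:
        raise ValueError("motif and flanks must both cover at least one base")
    flank_mean = sum(flanks) / len(flanks)
    if flank_mean == 0:
        return 0.0  # no signal at all, so no evidence of protection
    return 1.0 - (sum(inside) / len(inside)) / flank_mean


# Made-up profiles: a protected 6-bp motif versus a flat, unprotected one.
protected = [10] * 10 + [2] * 6 + [10] * 10
flat = [10] * 26

print(footprint_depth(protected, 10, 16))
print(footprint_depth(flat, 10, 16))
```

In practice such profiles are usually aggregated over many instances of the same motif, because the counts at a single site are too sparse to show a clear dip.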

This is a figure I pulled at random from Google. It shows a "dip" in the DNase signal where CTCF is bound (according to ChIP-seq data). This suggests that, with deeply sequenced ATAC-seq or DNase-seq data, you can infer "true" binding events rather than just the presence or absence of a sequence motif, which by itself does not tell you whether the TF is really bound to the DNA.

[Figure not preserved: DNase signal profile around CTCF binding sites identified by ChIP-seq]

PS: You may need a very large number of reads (probably >500 million from a single sample) to accurately infer "true" binding events, and it works best for TFs/proteins that have long residence times. For example, it works most of the time for CTCF.
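A rough back-of-the-envelope calculation shows why depth matters so much for footprinting. All numbers below are hypothetical assumptions chosen only to illustrate the scaling (fraction of fragments in peaks, total accessible bases, two cut sites per paired-end fragment). They are not measurements from any dataset.

```python
def expected_cuts_per_base(n_fragments, fraction_in_peaks, accessible_bp,
                           cuts_per_fragment=2):
    """Average number of cut sites per accessible base.

    Assumes cuts are spread evenly over accessible_bp, which is a
    simplification: real signal is concentrated unevenly across peaks.
    """
    return n_fragments * cuts_per_fragment * fraction_in_peaks / accessible_bp


# Hypothetical scenario: 30% of fragments fall in peaks covering 50 Mb.
for n_fragments in (50_000_000, 500_000_000):
    per_base = expected_cuts_per_base(n_fragments, 0.3, 50_000_000)
    print(f"{n_fragments:>11,d} fragments: ~{per_base:.1f} cuts per accessible base")
```

Under these assumptions, a tenfold increase in depth gives a tenfold increase in cuts per base, which is what makes a dip of a few base pairs distinguishable from sampling noise at individual sites.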

A good review here.

6 months ago, nanoide commented:

Thank you so much for the comprehensive response. It has been really useful. Thanks for the reference too. I appreciate your help.
6 months ago, geek_y commented:

I added a link to the review.