I have a BED file of enhancer sites that I'd like to run motif analysis on. I'm looking for core promoter elements (if any exist) in these regions, such as the TATA box, Sp1 sites, Inr, etc.
I came across MEME, and while I admittedly haven't read the entire manual yet (I'm working on it, though!), I thought it would be a good idea to ask here about common pitfalls for this type of analysis.
Specifically, I'm looking for advice on making this analysis statistically and biologically sound. Should the input to the MEME Suite be my BED file of enhancer sites, or should I first convert the BED file to FASTA? Which of the MEME Suite tools should I use if my enhancer sites range from 20 bp to 1000 bp in length? What is the difference between MEME's novel, ungapped motif discovery and GLAM2's novel, gapped motif discovery? Which one would be better suited to this type of analysis?
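For context, here is a minimal sketch of how the 20 bp to 1000 bp size range mentioned above could be checked on a BED file. It assumes standard BED columns (chrom, start, end) with 0-based, half-open coordinates, so each length is `end - start`. The function names and the example coordinates are hypothetical and only for illustration.

```python
def bed_lengths(lines):
    """Return the length of each interval from BED-formatted lines."""
    lengths = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        # BED intervals are 0-based and half-open, so length is end - start.
        lengths.append(end - start)
    return lengths


def out_of_range(lengths, min_len=20, max_len=1000):
    """Return the lengths that fall outside [min_len, max_len]."""
    return [n for n in lengths if n < min_len or n > max_len]


if __name__ == "__main__":
    # Made-up example intervals, not real enhancer coordinates.
    example = [
        "chr1\t1000\t1020",
        "chr1\t5000\t6000",
        "chr2\t300\t310",
    ]
    lengths = bed_lengths(example)
    print("min:", min(lengths), "max:", max(lengths))
    print("outside 20-1000 bp:", out_of_range(lengths))
```

To run it on a real file, the lines could be read with `open("enhancers.bed")`, where the file name is a placeholder.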