Question: The last time I played with transcription factor motif searching (many years ago) I used the TRANSFAC database of motifs combined with the MATCH tool at gene-regulation.com. Has the state of the art progressed? I'm particularly interested in web servers, web services, or R packages to do this analysis.
Background: A collaborator approached us wanting to study the transcriptional regulation of his favorite gene. Specifically, he wanted to identify key regulators that bind in the ~100 kb upstream region. Because that is such a wide net to cast, we started by giving him the ~2 kb of sequence (spread among four regions) with the highest phylogenetic conservation according to UCSC's conservation track.
Based on these results, he generated transgenic mice with various combinations of these conserved regions knocked out, and many of these mice had dramatically altered expression patterns. The next step, of course, is to identify the specific regulators binding to these regions.
He is currently making additional mice that refine the 2 kb of sequence into smaller chunks. In parallel, we'd like to use bioinformatics to identify candidate binding sites and their corresponding regulatory proteins.
Edit: To clarify, there are many tools that take many coregulated genes and find enriched motifs, but these tools do not really address my particular need. I'm interested in the regulation of exactly one gene, and I'm interested in identifying candidate binding motifs corresponding to known TFs in that gene's upstream genomic sequence.
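For concreteness, here is a rough Python sketch of the kind of single-sequence scan I have in mind: score every window of one sequence, on both strands, against a position weight matrix for a known factor and keep the windows above a cutoff. Everything specific in it is an assumption for illustration only: the count matrix is invented (consensus TGACTC), and the pseudocount, uniform background, threshold, and the names `TOY_COUNTS`, `log_odds_matrix`, and `scan` are mine. It is not how MATCH or any particular tool works.

```python
import math

# Hypothetical count matrix for an invented factor (consensus TGACTC).
# Real matrices would be exported from whichever motif database is used.
TOY_COUNTS = {
    "A": [0, 0, 10, 0, 0, 1],
    "C": [0, 0, 0, 9, 0, 8],
    "G": [0, 10, 0, 0, 1, 0],
    "T": [10, 0, 0, 1, 9, 1],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert per-position counts to log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix


def reverse_complement(seq):
    """Reverse-complement a DNA string; non-ACGT characters pass through."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, matrix, threshold):
    """Return (start, strand, site, score) for windows scoring >= threshold.

    start is 0-based and always given in forward-strand coordinates.
    """
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(b not in "ACGT" for b in window):
                continue  # skip windows containing N or other ambiguity codes
            score = sum(matrix[j][b] for j, b in enumerate(window))
            if score >= threshold:
                start = i if strand == "+" else len(s) - (i + width)
                hits.append((start, strand, window, round(score, 2)))
    return sorted(hits)


TOY_MATRIX = log_odds_matrix(TOY_COUNTS)
# Toy sequence with one forward-strand and one reverse-strand copy of the motif.
for hit in scan("AAATGACTCGGGGAGTCAAA", TOY_MATRIX, threshold=8.0):
    print(hit)
```

On the toy sequence this reports one forward-strand site at offset 3 and one reverse-strand site at offset 12. What I'm looking for is a tool that does this properly against a full library of known TF matrices, with sensible thresholds and ideally some use of the conservation information.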