Question

Transcription Factor Enrichment

20

13.6 years ago

Dave Bridges ★ 1.4k

What is the recommended (hopefully free) tool for finding enrichment of transcription factor binding sites in a set of promoter sequences?

transcription transcript sequence enrichment • 16k views

updated 7.6 years ago by jin ▴ 80 • written 13.6 years ago by Dave Bridges ★ 1.4k

score 10 · Answer 1 · 2010-10-17

13.6 years ago

Alastair Kerr 5.3k

Check out the data and tools in the JASPAR database (free).

Unfortunately, there is not much for nematodes.

Reply • 11.8 years ago by Assa Yeroslaviz ★ 1.8k

score 7 · Answer 2 · 2010-10-17

13.6 years ago

Mary 11k

It depends on what you want to do, of course, but you might find some tools in the MEME suite that could help you: http://meme.sdsc.edu/meme/

score 6 · Answer 3 · 2010-10-18

A grad student here had this very question at the beginning of her thesis. Like others here, we used TRANSFAC motifs; I would do that again, adding JASPAR to the mix. At the time, she did not know of any tools for this. We found two important considerations:

What defines the "promoter" or "gene control region" in human? We settled on 5000 bp of upstream sequence + exon 1 + intron 1 (either the entire intron or up to the first 1000 bp; I can't recall which). Why intron 1? Because many gene control elements are found there.
When looking for enrichment, how do you define your set of control genes? By size (given that we took exon 1 and intron 1 data)? By GO categories? By gene position (say, the neighboring gene)? This was tough, and your solution may be specific to the genes you're examining or the question(s) you are after.

The student then ran MAPPER to identify the TRANSFAC motifs.
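One of the control-set options mentioned above, matching by size, could look roughly like the sketch below. This is only an illustration, not what the student actually did: the function name, the 10% tolerance and the dictionaries of region lengths are all made up for the example.

```python
import random

def pick_length_matched_controls(target_lengths, candidate_lengths,
                                 tolerance=0.1, seed=0):
    """Pick one unused control gene per target gene with a similar region length.

    target_lengths / candidate_lengths: dict of gene ID -> region length in bp
    (hypothetical inputs). tolerance is the allowed relative length difference.
    Targets with no suitable candidate are simply skipped.
    """
    rng = random.Random(seed)              # fixed seed so the choice is repeatable
    available = dict(candidate_lengths)    # copy; candidates are used at most once
    controls = {}
    for gene, length in target_lengths.items():
        matches = [c for c, l in available.items()
                   if abs(l - length) <= tolerance * length]
        if not matches:
            continue
        chosen = rng.choice(matches)
        controls[gene] = chosen
        del available[chosen]
    return controls
```

Matching on GO category or genomic neighborhood would follow the same pattern with a different filter in place of the length comparison.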

score 5 · Answer 4 · 2010-10-18

The PAINT promoter analysis tool is my personal favorite. It will take a list of genes, find the upstream regions automatically, pass them through the free version of TRANSFAC and then compare the enrichment to a background set of genes, either user-provided or chosen from the built-in options. Everything is quite automated and very customizable.
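For anyone wondering what "compare the enrichment to a background set" can mean in practice, here is a minimal sketch of a one-sided hypergeometric test. It is a generic illustration and not necessarily the statistic PAINT itself uses; the function name and the counts in the usage lines are invented for the example.

```python
from math import comb

def hypergeom_enrichment_p(hits_fg, n_fg, hits_bg, n_bg):
    """One-sided P(X >= hits_fg) under a hypergeometric null.

    hits_fg / n_fg: promoters with the motif / total promoters in the gene set.
    hits_bg / n_bg: the same counts for the background set (disjoint from the
    gene set in this sketch).
    """
    total = n_fg + n_bg                  # all promoters considered
    total_hits = hits_fg + hits_bg       # all promoters carrying the motif
    denom = comb(total, n_fg)            # ways to draw the gene set
    upper = min(n_fg, total_hits)
    tail = sum(comb(total_hits, k) * comb(total - total_hits, n_fg - k)
               for k in range(hits_fg, upper + 1))
    return tail / denom

# Hypothetical counts: motif found in 30 of 50 promoters of interest
# versus 100 of 1000 background promoters.
p_value = hypergeom_enrichment_p(30, 50, 100, 1000)
print(p_value)
```

When many motifs are tested this way, the resulting p-values still need a multiple-testing correction.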

score 5 · Answer 5 · 2010-10-18

13.6 years ago

Ian 6.0k

You could try PSCAN (http://159.149.109.9/pscan/); it uses TFBSs from both TRANSFAC and JASPAR.

score 4 · Answer 6 · 2013-02-26

11.2 years ago

boczniak767 ▴ 850

For people working with plants, ELEMENT could be useful.
It searches not only for known motifs but also for enriched words.

I've found Clover (http://cagt.bu.edu/page/Clover_about) quite useful, but you have to provide matrices for the TF binding sites. The program also has not been updated for a long time.

Reply • 9.6 years ago by boczniak767 ▴ 850

score 3 · Answer 7 · 2010-10-17

I would suggest customizing your favorite GO enrichment tool so that the background gene list contains only TFs or genes annotated with TF-related terms, and then performing the enrichment calculation. I tried this for a small analysis.

Another option is to use a published method such as Modulator Inference by Network Dynamics (MINDy). Disclaimer: I have not tried MINDy myself.

score 3 · Answer 8 · 2010-10-18

13.6 years ago

Fiamh ▴ 220

I'd second MEME as a conservative approach. Try to see which patterns are stable over a range of promoter sizes, promoter subsets and cutoffs. Once you have those, switch to TOMTOM (part of the MEME suite) to map them to JASPAR or TRANSFAC matrices.
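The stability check described above could be as simple as the sketch below, assuming the motifs from each run have already been collected and reduced to comparable IDs (deciding when two discovered motifs are "the same" is the hard part and is not handled here). The function name and run labels are hypothetical.

```python
from collections import Counter

def stable_motifs(runs, min_fraction=1.0):
    """Return the motif IDs reported in at least min_fraction of the runs.

    runs: dict of run label (e.g. promoter size or cutoff) -> iterable of
    motif IDs found in that run (hypothetical input format).
    """
    if not runs:
        return set()
    # Count each motif once per run, even if a run lists it several times.
    counts = Counter(m for motifs in runs.values() for m in set(motifs))
    needed = min_fraction * len(runs)
    return {m for m, c in counts.items() if c >= needed}

# Hypothetical motif IDs from three runs with different promoter sizes.
runs = {
    "500bp": {"motif_1", "motif_2"},
    "1000bp": {"motif_1"},
    "2000bp": {"motif_1", "motif_3"},
}
print(stable_motifs(runs))
```

Only the motifs that survive this filter would then go on to the TOMTOM comparison.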

score 3 · Answer 9 · 2010-10-18

13.6 years ago

Fiamh ▴ 220

That's actually one major difference between the various tools. Dave, do you have a list of genes or promoter sequences? Many tools expect a list of genes because they have their own concept of what a promoter is. If you have CAGE or RNA-Seq data and would like to define which promoter you are interested in, half of the existing systems won't be of use to you. Likewise, if you are working with a species not supported by the system, you'd be out of luck.

I have genes (but I used BioMart to get my own promoter sequences).

Reply • 13.6 years ago by Dave Bridges ★ 1.4k

In that case, a number of systems such as cisRED won't make sense to use, as they have their own fixed definition of start sites and promoters. The second question is how many promoters you have. If it's just a few, you will probably have to fall back on systems that use phylogenetic footprinting to increase your chances of finding functional binding sites.

Reply • 13.6 years ago by Fiamh ▴ 220

score 2 · Answer 10 · 2010-10-17

Maybe RSAT? It seems to have a fairly broad collection of useful motif and CRM building and scanning tools, although I haven't used them myself yet, so I can't tell you much about them. The web site and services are free, but I think you have to register by post(?!) to install the tools locally. http://rsat.ulb.ac.be/rsat/

score 0 · Answer 11 · 2016-10-19

7.6 years ago

jin ▴ 80

You can find the enriched transcription factor binding sites in a set of promoter sequences for plants using the tool provided by PlantRegMap.
