Is There A Standard Approach To Predict Transcription Factors For One Particular Gene?
12.5 years ago
Matis ▴ 70

Hello, I am working on a set of differentially expressed genes in a human cell line. So far, I have run several analyses and reduced the set to a manageable size; the biologists then did some further literature research and picked 3 genes of interest.

It would now be helpful to find possible transcription factors for these genes and check them against their expression levels. Unfortunately, I'm not very familiar with this kind of analysis and would be glad for any hint about which tool or approach might be suitable for this task.

I had a look at the related topics in this forum, but most of them were either looking for the genes targeted by a given transcription factor or doing enrichment analysis on TFs for a set of genes (DAVID). Or maybe I got something wrong and someone can point me to some good reading?

So far, I have looked at the tools mentioned there, such as FANTOM, pscan, MAPPER, pazar, OPOSSUM, TFM_EXPLORER, and of course the databases (JASPAR, TRANSFAC), but the results I get from these tools are quite different (maybe because I'm using the wrong tool?).

transcription binding gene • 3.6k views
12.5 years ago

First, the transcription factors (TFs) regulating these three genes must be expressed in the cell line whose data you've analyzed. That is obvious to some but overlooked by others. Thus, any predictions of TF binding sites (TFBS) by MAPPER et al. (using TRANSFAC and JASPAR models) need to be filtered down to TFs that make sense given your literature searches, expression data, or some other knowledge base.
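
A minimal sketch of that expression filter, assuming the predicted TFs have already been mapped to gene symbols and that you have one expression value per gene. The function name, the cutoff, and all values below are made up for illustration.

```python
def filter_expressed_tfs(predicted_tfs, expression, min_expression=1.0):
    """Keep only predicted TFs whose own gene is expressed in the cell line.

    predicted_tfs: iterable of TF gene symbols from a TFBS prediction tool.
    expression: dict mapping gene symbol -> expression value (e.g. TPM).
    min_expression: hypothetical cutoff; choose one that suits your data.
    """
    return [tf for tf in predicted_tfs
            if expression.get(tf, 0.0) >= min_expression]


# Made-up example values, for illustration only.
predicted = ["SP1", "NFYA", "MYOD1", "GATA1"]
expression = {"SP1": 35.2, "NFYA": 12.8, "MYOD1": 0.0, "GATA1": 0.3}

expressed = filter_expressed_tfs(predicted, expression)
print(expressed)
```

TFs missing from the expression table are treated as not expressed here, which is a conservative choice you may want to change.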

A second item to consider is the nature of the TF-gene interaction. Some TFs operate with a co-factor, so both TFBSs need to be predicted, typically neighboring one another, to give a plausible prediction. In other cases the effect is cooperative: TFs bind to several sites, and transcription is initiated only when occupancy reaches a certain level. Results from MAPPER et al. will be quite different from one another because each tool uses the TFBS models in different ways. You could use them all and develop a scoring scheme to derive plausible results. You can also pick a negative control for this approach in order to cover that aspect of the experiment.
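
One possible scoring scheme, sketched under the assumption that each tool's output has been reduced to a plain list of TF names: score each TF by the fraction of tools that predict it. This is only one simple choice (it ignores each tool's own scores), and the tool results below are invented.

```python
from collections import Counter


def consensus_scores(tool_predictions):
    """Return {tf: fraction of tools that predicted it}."""
    counts = Counter()
    for tfs in tool_predictions.values():
        counts.update(set(tfs))  # count each TF at most once per tool
    n_tools = len(tool_predictions)
    return {tf: count / n_tools for tf, count in counts.items()}


def rank_by_consensus(tool_predictions, min_fraction=0.5):
    """TFs supported by at least min_fraction of the tools, best first."""
    scores = consensus_scores(tool_predictions)
    kept = [(tf, s) for tf, s in scores.items() if s >= min_fraction]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Invented predictions for one gene, for illustration only.
tools = {
    "MAPPER": ["SP1", "NFYA", "GATA1"],
    "pscan": ["SP1", "NFYA"],
    "OPOSSUM": ["SP1", "E2F1"],
}
ranked = rank_by_consensus(tools)
print(ranked)
```

A tool that returns 1000 hits and one that returns 10 get the same vote here, so the very long lists should be thresholded before they are combined.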

If you have a gene-TF pair that you expect or know is part of the same response as your three genes, then use that gene's sequence as a control or bait to identify that known TF and its binding site(s). In other words, which tool can identify TFBSs in your positive control gene?
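
As a rough sketch of that positive-control check: run every tool on the control gene and list which ones recover the known TF. The tool outputs and the choice of SP1 as the "known" TF are hypothetical placeholders.

```python
def tools_recovering_control(control_predictions, known_tf):
    """Names of the tools whose predictions for the control gene include known_tf."""
    return sorted(tool for tool, tfs in control_predictions.items()
                  if known_tf in tfs)


# Hypothetical predictions on the positive-control gene's sequence.
control_predictions = {
    "MAPPER": ["SP1", "NFYA", "GATA1"],
    "pscan": ["NFYA"],
    "OPOSSUM": ["SP1"],
}
good_tools = tools_recovering_control(control_predictions, known_tf="SP1")
print(good_tools)
```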


Thanks for your quick help! I intended to filter the resulting TF list with the gene expression data I have and then the biologists could prioritize according to their biological insight and literature.

The problem in comparing MAPPER etc. is that one method predicts only 10 TFs, while MAPPER, for example, suggests more than 1000 (if the search covers 5 kb upstream and 2 kb downstream).

For the control gene-TF pair, I am going to check back with my colleagues and then compare the different tools by their performance on this pair.


Be sure to filter your MAPPER results to only those above a certain score and below a certain E-value. MAPPER should provide guidelines here. Also, some of the programs return results for non-human or non-mammalian TFs, and these of course would be irrelevant.
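
Something along these lines, assuming the hits have been exported into records with `score`, `evalue`, and `species` fields. Those field names and both thresholds are placeholders, not MAPPER's actual output format or recommended cutoffs.

```python
def filter_hits(hits, min_score=5.0, max_evalue=1.0, species=("human",)):
    """Keep hits with a high enough score, a low enough E-value, and a wanted species."""
    return [hit for hit in hits
            if hit["score"] >= min_score
            and hit["evalue"] <= max_evalue
            and hit["species"] in species]


# Made-up hits, for illustration only.
hits = [
    {"tf": "SP1", "score": 7.1, "evalue": 0.2, "species": "human"},
    {"tf": "NF-Y", "score": 4.0, "evalue": 0.5, "species": "human"},   # score too low
    {"tf": "GATA1", "score": 8.3, "evalue": 3.0, "species": "human"},  # E-value too high
    {"tf": "Adf-1", "score": 9.0, "evalue": 0.1, "species": "fly"},    # wrong species
]
filtered = filter_hits(hits)
print([hit["tf"] for hit in filtered])
```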


Reporting back after some struggle with the MAPPER database today. I couldn't find precise advice on the web page or in their paper on how to set the thresholds for score and E-value, or how to weigh them against each other, so I used the default parameters and just decreased the E-value threshold a little. Still, I have a lot of results. As a next step, I intend to compare the TFs to my data and also to filter out non-human TFs. Unfortunately, mapping the TF names is not trivial, since there are no direct links to the standard databases...


...An example would be the TF hit NF-Y (T00150), which is also listed there as CP1, but I can't find either of them in UniProt or get a clear link to a database via Google. Maybe this refers to TCP1, but I am sure there must be an easier way of mapping them?
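
One workable fallback is a small hand-curated alias table that maps tool-specific TF names to gene symbols. The sketch below is only that: NF-Y is a complex whose subunits have the gene symbols NFYA, NFYB, and NFYC, while the CP1 entry assumes it is a synonym for the same complex, so every entry should be verified against HGNC or UniProt before use.

```python
def normalize(name):
    """Upper-case and drop non-alphanumeric characters, so 'NF-Y' and 'nfy' match."""
    return "".join(ch for ch in name.upper() if ch.isalnum())


# Hand-written alias table (an assumption, not an official mapping).
ALIASES = {
    "NFY": ["NFYA", "NFYB", "NFYC"],
    "CP1": ["NFYA", "NFYB", "NFYC"],  # assumed synonym of NF-Y; verify
}


def to_gene_symbols(tf_name, aliases=ALIASES):
    """Return the gene symbols for a TF name, or an empty list if unknown."""
    return aliases.get(normalize(tf_name), [])


print(to_gene_symbols("NF-Y"))
print(to_gene_symbols("TCP1"))  # not in the table, so it needs manual review
```

Names that come back empty are the ones worth looking up by hand, rather than guessing from a similar-looking symbol.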

