I'm a PhD student and I'm working with a viral transcription factor. My task is to find out where the transcription factor binds in the human genome and what it does there. The hypothesis was that it would bind near promoters and activate human genes, since this is its job in the viral genome.
What I can see after ChIP-Seq is that it seems to bind more or less everywhere. With MACS2 default settings there are more than 300,000 peaks, which are also visible in IGV (see picture at the end of the post).
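To see how many of these peaks survive a stricter cutoff than the default, the MACS2 narrowPeak output could be filtered on its q-value column (column 9, stored as -log10 of the q-value). This is only a minimal sketch: the three example records, the cutoff of 10 and the function name `filter_narrowpeak` are made up for illustration and are not my real data.

```python
def filter_narrowpeak(lines, min_neg_log10_q=10.0):
    """Keep narrowPeak records whose -log10(q-value) passes the cutoff.

    narrowPeak columns: chrom, start, end, name, score, strand,
    signalValue, -log10(pValue), -log10(qValue), summit offset.
    """
    kept = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # not a complete narrowPeak record
        if float(fields[8]) >= min_neg_log10_q:
            kept.append(fields)
    return kept


# Invented example records (assumption, for illustration only).
example = [
    "chr1\t1000\t1200\tpeak_1\t250\t.\t12.5\t30.1\t27.4\t95",
    "chr1\t5000\t5150\tpeak_2\t40\t.\t3.1\t5.2\t3.0\t60",
    "chr2\t800\t1100\tpeak_3\t120\t.\t8.0\t15.9\t13.2\t140",
]
strong = filter_narrowpeak(example, min_neg_log10_q=10.0)
print(len(strong), "of", len(example), "peaks kept")
```

With a real file, the lines would come from `open("..._peaks.narrowPeak")` instead of the list above. How the peak count drops as the cutoff rises might already say something about how many of the 300,000 sites are strong.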
When I analyse the peaks with MEME/DREME, I find binding motifs in up to 90% of the peaks, and these match the binding sites in the virus (binding of the transcription factor at these sites was confirmed by EMSA, electrophoretic mobility shift assay).
Checking in IGV, I cannot see an exclusive association with transcriptional start sites, exons, or any other feature.
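Instead of judging this by eye in IGV, the association with transcriptional start sites could be quantified, for example as the fraction of peak summits lying within a fixed window of the nearest TSS. The following is a rough sketch under stated assumptions: the coordinates are invented, the 1 kb window is an arbitrary choice, and the function names are hypothetical helpers, not part of any existing tool.

```python
import bisect
from collections import defaultdict


def build_tss_index(tss_records):
    """Group TSS positions by chromosome and sort them for binary search."""
    index = defaultdict(list)
    for chrom, pos in tss_records:
        index[chrom].append(pos)
    for chrom in index:
        index[chrom].sort()
    return index


def distance_to_nearest_tss(chrom, summit, index):
    """Return the absolute distance to the closest TSS, or None if the
    chromosome has no annotated TSS."""
    positions = index.get(chrom)
    if not positions:
        return None
    i = bisect.bisect_left(positions, summit)
    candidates = []
    if i < len(positions):
        candidates.append(positions[i] - summit)      # nearest TSS at or right of summit
    if i > 0:
        candidates.append(summit - positions[i - 1])  # nearest TSS left of summit
    return min(candidates)


def fraction_near_tss(summits, index, window=1000):
    """Fraction of summits within `window` bp of a TSS."""
    distances = [distance_to_nearest_tss(c, s, index) for c, s in summits]
    distances = [d for d in distances if d is not None]
    if not distances:
        return 0.0
    return sum(d <= window for d in distances) / len(distances)


# Invented coordinates (assumption, for illustration only).
tss = [("chr1", 10_000), ("chr1", 50_000), ("chr2", 7_500)]
summits = [("chr1", 10_300), ("chr1", 30_000), ("chr2", 7_400), ("chr2", 90_000)]
index = build_tss_index(tss)
print(fraction_near_tss(summits, index, window=1000))
```

The number only becomes meaningful next to a background, for example the same fraction computed for randomly placed regions of the same size, since with 300,000 peaks many will fall near a TSS by chance.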
RNA-Seq revealed that upon expression of the viral transcription factor, cellular genes are upregulated (this is also its job in the viral genome) as well as downregulated (a new feature). After 6 h the strongest upregulation is about 10-fold, and the strongest downregulation is to 5% of the original expression. Only about 10 genes are strongly upregulated and about 10 strongly downregulated, which was kind of strange, since the complete cell is reprogrammed.
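To compare the two extremes on the same scale, both can be expressed as log2 fold changes. A small worked example, using only the two ratios mentioned above (the function name is just for illustration):

```python
import math


def log2_fold_change(ratio):
    """log2 of the expression ratio (treated / untreated)."""
    return math.log2(ratio)


up = log2_fold_change(10.0)    # about 10-fold upregulation
down = log2_fold_change(0.05)  # expression reduced to 5% of the original
print(round(up, 2), round(down, 2))  # 3.32 -4.32
```

So a reduction to 5% corresponds to a 20-fold decrease, which on the log scale is a larger change than the strongest upregulation.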
My question now is whether anybody knows of other transcription factors with about 300,000 binding sites, and whether people just pick the ones that best fit their concepts. And maybe somebody has an idea what this transcription factor could be doing with that many binding sites? I have some ideas but don't want to write them down right now, so as not to push the discussion in a particular direction.
I'd be grateful for any input!
By the way, does anybody know how I can load RefSeq genes into IGV and see the gene names instead of the NM numbers?
Thanks a lot, Alex
About the picture:
1st row: ChIP input
2nd row: ChIP transcription factor
3rd row: MACS2 peaks
4th row: RNA-Seq 0 h
5th row: RNA-Seq 3 h
6th row: RNA-Seq 6 h
7th row: RefSeqGenes
8th row: knownGeneTSS (transcriptional start sites)