I have a file with enhancers in the 1st column and, in the 2nd column, the name of a transcription factor (TF) that has a binding site in that enhancer. I want to find out which enhancers share binding sites for the same transcription factors. I made a heatmap in R, but my data set is so large that it is impossible to read off the number of TFs shared by a group of enhancers. How can I accomplish this in R? My data looks like this:
Enhancer         TF
Gene1_Enhancer1  Arid3a
Gene1_Enhancer1  Hoxa4
Gene1_Enhancer1  Ascl2
Gene1_Enhancer1  EBP
Gene1_Enhancer2  ETS1
Gene2_Enhancer1  ETS1
Gene2_Enhancer1  EBP
Gene2_Enhancer1  Arid3a
Gene2_Enhancer1  Hoxa4
Gene3_Enhancer1  Arid3a
Gene3_Enhancer1  Hoxa4
Gene3_Enhancer1  EBP
Gene3_Enhancer2  Hoxa7
Gene4_Enhancer1  Hoxa4
Gene4_Enhancer1  EBP
Gene4_Enhancer1  Arid3a
Is there a way to get output like this in a text file, where each group contains one or more enhancers from all 4 genes?
Group                                                               Common TFs
Gene1_Enhancer1, Gene2_Enhancer1, Gene3_Enhancer1, Gene4_Enhancer1  Arid3a, EBP, Hoxa4
Thanks a lot!!!
Thanks a lot. I tried this. It works well and finds the TFs common to all enhancers. I'm sorry, I probably didn't make it clear: I have many enhancers for each gene, say 45 per gene. I want to find groups of TFs that are present in groups of enhancers drawn from all genes. For example, besides the group shown above, there may be another group of enhancers within this huge set that shares an entirely different set of TFs; those enhancers are nevertheless similar to each other and therefore interesting to me. So, in addition to the TFs common to the entire set of enhancers, which this function gives me, I want all of these different groups of enhancers together with their common TFs. Is there any way to use this function for that? Thanks a lot!
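To make the request concrete, here is a rough base-R sketch of one possible approach; it is not the function mentioned above. It takes the pairwise intersections of the enhancers' TF sets as candidate TF groups, collects every enhancer that contains each candidate, and keeps only the groups that cover all genes. Several things here are assumptions for illustration: the gene name is taken to be the text before the first underscore, `min_tfs` is a hypothetical threshold for the minimum number of shared TFs, and `enhancer_groups.txt` is a made-up output file name.

```r
# Example rows from the question, with enhancer names spelled consistently.
df <- data.frame(
  Enhancer = c(rep("Gene1_Enhancer1", 4), "Gene1_Enhancer2",
               rep("Gene2_Enhancer1", 4), rep("Gene3_Enhancer1", 3),
               "Gene3_Enhancer2", rep("Gene4_Enhancer1", 3)),
  TF = c("Arid3a", "Hoxa4", "Ascl2", "EBP", "ETS1",
         "ETS1", "EBP", "Arid3a", "Hoxa4",
         "Arid3a", "Hoxa4", "EBP", "Hoxa7",
         "Hoxa4", "EBP", "Arid3a"),
  stringsAsFactors = FALSE
)

# One character vector of TFs per enhancer.
tf_sets <- lapply(split(df$TF, df$Enhancer), unique)

# Assumption: the gene label is everything before the first underscore.
gene_of <- function(enh) sub("_.*$", "", enh)
all_genes <- unique(gene_of(names(tf_sets)))

min_tfs <- 2  # hypothetical threshold: minimum number of shared TFs per group

# Candidate TF sets: the intersection of every pair of enhancers.
pairs <- combn(names(tf_sets), 2, simplify = FALSE)
candidates <- lapply(pairs, function(p) {
  sort(intersect(tf_sets[[p[1]]], tf_sets[[p[2]]]))
})
candidates <- unique(candidates[lengths(candidates) >= min_tfs])

# For each candidate, collect all enhancers containing every TF in it,
# and keep the group only if it has at least one enhancer from each gene.
groups <- lapply(candidates, function(tfs) {
  has_all <- vapply(tf_sets, function(s) all(tfs %in% s), logical(1))
  members <- names(tf_sets)[has_all]
  if (!setequal(gene_of(members), all_genes)) return(NULL)
  data.frame(Group = paste(members, collapse = ", "),
             Common_TFs = paste(tfs, collapse = ", "),
             stringsAsFactors = FALSE)
})
groups <- Filter(Negate(is.null), groups)
result <- unique(do.call(rbind, groups))

# Tab-separated text file, one group per row.
write.table(result, "enhancer_groups.txt", sep = "\t",
            quote = FALSE, row.names = FALSE)
```

On the example rows this gives a single group (Gene1_Enhancer1, Gene2_Enhancer1, Gene3_Enhancer1, Gene4_Enhancer1) with common TFs Arid3a, EBP, Hoxa4. Note that pairwise intersections do not enumerate every possible shared TF set (a set shared only by three or more enhancers can be smaller than any pairwise intersection), so for a large data set a frequent-itemset approach, for example via the arules package, may be worth looking at instead.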