In many ChIP-seq transcription factor binding sites (from ENCODE data), I found both the forward and the reverse motif within the same binding site.
For example, for the GATA1 transcription factor, the following 100 bp sequence was identified from ENCODE as a positive binding site. From the CIS-BP database, the forward-strand motif is (roughly) GATA, and the reverse-strand motif (its reverse complement) is TATC. The sequence below is the forward strand, and you can see that it contains both GATA and TATC.
I have noticed this in many positive sequences for many different TFs. In addition, for many TFs I found roughly equal numbers of forward and reverse motifs across all of the positive sites. Does anyone have any insight into why that might be?
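
To make the count concrete, here is a minimal sketch of one way to tally forward and reverse-complement motif hits on a single forward-strand sequence. It assumes plain exact-string matching of GATA and TATC, which is only a rough stand-in for scanning with the CIS-BP position weight matrix. The function names and the toy sequence are made up for illustration and are not taken from ENCODE.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping hits."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def strand_counts(seq, motif):
    """Return (forward hits, reverse-complement hits) on the given strand."""
    seq = seq.upper()
    motif = motif.upper()
    forward_hits = count_overlapping(seq, motif)
    reverse_hits = count_overlapping(seq, reverse_complement(motif))
    return forward_hits, reverse_hits


# Hypothetical toy sequence, not a real ENCODE peak.
toy_sequence = "CCGATAAGGTTATCTGCAGATAGC"
print(reverse_complement("GATA"))           # TATC
print(strand_counts(toy_sequence, "GATA"))  # (2, 1)
```

Note that a TATC hit on the forward strand is the same thing as a GATA hit on the reverse strand, so a count like this is really asking how often the motif appears in either orientation within a peak.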