How can I test whether two TFBSs co-occur more often than one would expect by chance? I would like to do this either across all promoters (1000 bp each) or across the complete genome, counting two sites as co-occurring when they lie within 1000 bp of one another. I have thought about this statistical problem, but I am not an expert in probability and got stuck.
My first thought was to calculate the probability that two k-mers of lengths n and m co-occur within 1000 bp purely by chance, but I am not sure how to calculate this. Any suggestions here?
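A minimal sketch of one way to approximate this, under strong assumptions that are not part of the question itself: bases are independent and equally likely (0.25 each), each k-mer is one fixed sequence, occurrences at different start positions are treated as independent, and the two k-mers are treated as independent of each other. The function names `p_at_least_one` and `p_cooccur` are hypothetical and only used for illustration.

```python
def p_at_least_one(k, window=1000, p_base=0.25, both_strands=False):
    """Approximate chance that one fixed k-mer occurs at least once in a window.

    Assumes i.i.d. bases with probability p_base each and treats start
    positions as independent, which ignores overlapping occurrences.
    """
    n_starts = window - k + 1          # possible start positions in the window
    if n_starts <= 0:
        return 0.0                     # k-mer is longer than the window
    if both_strands:
        n_starts *= 2                  # crude: ignores palindromic k-mers
    p_site = p_base ** k               # chance of a match at one position
    return 1.0 - (1.0 - p_site) ** n_starts


def p_cooccur(n, m, window=1000, **kwargs):
    """Approximate chance that both k-mers (lengths n and m) occur in one window.

    Assumes the two k-mers occur independently of each other.
    """
    return p_at_least_one(n, window, **kwargs) * p_at_least_one(m, window, **kwargs)


if __name__ == "__main__":
    # Example: an 8-mer and a 10-mer in the same 1000 bp window.
    print(p_cooccur(8, 10))
```

Real genomes are far from uniform (GC content, repeats, CpG islands), so a number obtained this way is probably only a rough lower bound on how often the pair appears together by chance.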
Then I thought that this calculation might be unsuitable, because motifs are not simple k-mers but letter-probability matrices (MEME format), so randomising the matrices might be a better idea. Maybe the best way would be to randomly shuffle the values within each row? Would this be an acceptable approach? Is there a tool for something like this?
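To make the row-shuffling idea concrete, here is a small sketch, assuming the matrix is laid out as in MEME format with one row per motif position and four columns (A, C, G, T). The helper name `shuffle_rows` and the toy matrix values are made up for illustration and do not come from any real motif.

```python
import random


def shuffle_rows(matrix, seed=None):
    """Return a copy of the matrix with the values in each row permuted.

    Each row still sums to the same total, so every position keeps its
    information content, but the preferred base at that position changes.
    """
    rng = random.Random(seed)          # local generator, reproducible via seed
    shuffled = []
    for row in matrix:
        new_row = list(row)            # copy so the input is not modified
        rng.shuffle(new_row)
        shuffled.append(new_row)
    return shuffled


if __name__ == "__main__":
    # Toy 3-position motif (hypothetical values), columns are A, C, G, T.
    motif = [
        [0.80, 0.10, 0.05, 0.05],
        [0.10, 0.70, 0.10, 0.10],
        [0.25, 0.25, 0.25, 0.25],
    ]
    for row in shuffle_rows(motif, seed=1):
        print(row)
```

Scanning with many such shuffled matrices would give an empirical null distribution for the co-occurrence count. Whether shuffling within rows is a better null model than, for example, permuting the order of the positions is exactly the part I am unsure about.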
It gets even more complicated because for one of the TFs I have three matrices that are quite similar, and I am not sure how to handle this. Any suggestions here?
Any insight is HIGHLY appreciated! Thank you :)