5.3 years ago by
Seattle, WA USA
BEDOPS offers a tool for this called closest-features, which finds the nearest query element(s) to each of a set of reference elements. (In your use case, TF binding sites would be query elements, and your genes (say, TSSs) are your reference elements.)
It's very simple to use and very fast, with a low memory profile. R and its libraries tend to load everything into system memory, which can be a problem if you're working with large datasets.
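If it helps to see what "nearest query element per reference element" means concretely, here is a toy pure-Python sketch of the idea. It is not how closest-features is implemented, and it holds everything in memory, so it is only for building intuition on small inputs. The function names (`gap`, `nearest`) and all coordinates are made up for illustration, and the distance convention (0 for overlapping or touching intervals) is my assumption, not necessarily the one BEDOPS reports. The real tool expects sorted BED input and, as far as I know, has `--closest` and `--dist` options for this kind of report; check its documentation for exact usage.

```python
# BED-style intervals as (chrom, start, end), 0-based and half-open.
# All coordinates are invented toy values, not real annotations.

def gap(ref, query):
    """Bases separating two intervals on the same chromosome (0 if they overlap or touch)."""
    _, r_start, r_end = ref
    _, q_start, q_end = query
    return max(0, q_start - r_end, r_start - q_end)

def nearest(ref, queries):
    """Return (query, distance) for the query closest to ref, or None if
    no query lies on the same chromosome. Ties go to the first query listed."""
    same_chrom = [q for q in queries if q[0] == ref[0]]
    if not same_chrom:
        return None
    best = min(same_chrom, key=lambda q: gap(ref, q))
    return best, gap(ref, best)

# Reference elements: hypothetical 1-bp TSS positions.
tss = [("chr1", 1000, 1001), ("chr1", 5000, 5001), ("chr2", 300, 301)]
# Query elements: hypothetical TF binding sites.
tfbs = [("chr1", 200, 220), ("chr1", 1100, 1120), ("chr1", 4990, 5010)]

for ref in tss:
    print(ref, "->", nearest(ref, tfbs))
```

Each reference element gets one answer (or none, if its chromosome has no query elements), which is the shape of result you would then join back to your gene list.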
To get your TFs ready, you can use the bedops set operation tool to filter your transcription factor set for TF binding sites that overlap ChIP-seq peaks or other regions. Take a look at the
Then you might use closest-features to look for the nearest ChIP-seq-peak-overlapping TF binding site to each member of your reference set, for example, gene transcription start sites (COX-1, etc.).
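As a toy illustration of the filtering step mentioned above (keeping only TF binding sites that overlap at least one ChIP-seq peak), here is a small pure-Python sketch. Again, this is not BEDOPS itself, just the set operation spelled out; the helper names (`overlaps`, `keep_overlapping`) and the coordinates are hypothetical, and I am assuming "overlap" means sharing at least one base. In bedops, I believe the matching operation is `--element-of`, but check the documentation for the exact arguments.

```python
# BED-style intervals as (chrom, start, end), 0-based and half-open.
# All coordinates are invented toy values, not real annotations.

def overlaps(a, b):
    """True if two intervals are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def keep_overlapping(sites, peaks):
    """Keep each site that overlaps at least one peak, preserving input order."""
    return [s for s in sites if any(overlaps(s, p) for p in peaks)]

# Hypothetical TF binding sites and ChIP-seq peaks.
tfbs = [("chr1", 200, 220), ("chr1", 1100, 1120), ("chr1", 4990, 5010)]
peaks = [("chr1", 1050, 1300), ("chr1", 5005, 5400)]

filtered = keep_overlapping(tfbs, peaks)
print(filtered)
```

The filtered list is what you would then hand to the nearest-element lookup as the query set, with your TSSs as the reference set.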