7.0 years ago by
It is admittedly my own tool, but the closest operation in bedtools will do what you want. The -d option reports the distance between each retrotransposon and its nearest gene. If the two in fact overlap, the distance will be 0. My answer assumes that the "genes.bed" file includes each gene's strand. If it does, the strand will be reported in the output. Note that GFF input is fine as well.
bedtools closest -a retro-inserts.bed -b genes.bed -d
Also, I just remembered that Galaxy has a nice option in its "Operate on Genomic Intervals" section called "Fetch closest non-overlapping feature for every interval". This is an equally good option, though it looks like it doesn't report the distance between intervals. That said, once you have the coordinates of each interval and its closest feature, a little awk and the formula I mention in this thread is all you need to get the distance.
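In case it helps, here is a rough awk sketch of that last step. It is not necessarily the formula from the other thread, just one common way to compute the gap between two intervals. The function name pair_distance and the six-column layout (chromA, startA, endA, chromB, startB, endB, tab-separated) are my own assumptions for illustration, so adjust the column numbers to whatever your file actually contains. It treats the coordinates as BED-style and counts overlapping or directly adjacent intervals as distance 0, which may differ by one from what other tools report.

```shell
# Hypothetical helper: appends a gap-distance column to every input line.
# Assumed layout (tab-separated): chromA startA endA chromB startB endB
pair_distance() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    # Intervals on different chromosomes have no meaningful distance.
    if ($1 != $4) { print $0, "NA"; next }
    # Larger start minus smaller end is the gap; negative means overlap.
    lo = ($2 > $5) ? $2 : $5
    hi = ($3 < $6) ? $3 : $6
    gap = lo - hi
    if (gap < 0) gap = 0
    print $0, gap
  }' "$@"
}

# Example: one pair separated by a gap and one overlapping pair.
printf 'chr1\t100\t200\tchr1\t300\t400\nchr1\t100\t200\tchr1\t150\t250\n' | pair_distance
```

You can also pass a file name instead of piping, e.g. pair_distance pairs.txt, where pairs.txt is a placeholder for your own file.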