I have a BED file with 5 columns, where the 4th is a unique ID and the 5th is a gene ID.
I was trying to use bedtools to cluster this BED file by gene ID and output a single line for each gene with the full range (chr, start, end) of its region. Basically, I want to collapse the intervals of each gene into one. Example:
chr1 10 1000 ID1 GeneID1
chr1 20 1300 ID2 GeneID1
chr1 1400 1600 ID3 GeneID1
I'm trying to get output like this:
chr1 10 1600 GeneID1
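To make the intended result concrete, here is a minimal Python sketch of the grouping I have in mind. It is only an illustration, not a bedtools equivalent: the function name `collapse_by_gene` is made up, it assumes whitespace-separated columns, and it assumes records should be grouped per chromosome and gene ID. It takes the smallest start and largest end per gene, so gaps between intervals (such as 1300 to 1400 above) are spanned rather than kept separate.

```python
def collapse_by_gene(lines):
    """Collapse BED records to one (chrom, min start, max end, gene) per gene."""
    spans = {}  # (chrom, gene) -> [start, end]; dicts preserve insertion order
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        chrom, start, end, gene = fields[0], int(fields[1]), int(fields[2]), fields[4]
        key = (chrom, gene)  # assumption: a gene on two chromosomes gives two rows
        if key in spans:
            spans[key][0] = min(spans[key][0], start)
            spans[key][1] = max(spans[key][1], end)
        else:
            spans[key] = [start, end]
    return [(chrom, s, e, gene) for (chrom, gene), (s, e) in spans.items()]


example = """chr1 10 1000 ID1 GeneID1
chr1 20 1300 ID2 GeneID1
chr1 1400 1600 ID3 GeneID1"""

for record in collapse_by_gene(example.splitlines()):
    print(*record, sep="\t")
```

Because it only tracks a running minimum and maximum per gene, the input does not need to be sorted.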
Can anyone tell me whether bedtools is the best way to do this, or whether it can be done with awk alone? Any ideas?