Is there any tool that can group reads in a BED file that lie within a few bases of each other?
For instance,
Chr1   1023080  1023114 XYZ 806 +
Chr1   1023081  1023115 XYZ 50  +
Chr1   1023083  1023117 ABC 3   +
Chr1   1023085  1023119 cbd 90 +
I would like to group the reads if they are within 4 bases of each other, and report the results as follows:
Chr1  1023080   1023117  xyz 859 +
Chr1   1023085  1023119 cbd 90 +
Is it possible using bedtools or awk in a short script? The name column doesn't really matter. Any directions would be most welcome. Thanks!
I don't follow your example. 1) Why is the second record in your output distinct from the first? 2) Why are the coordinates of the first output record not Chr1:1023080-1023117?
Yes, the first record should end at 1023117; that was a typo! The last record in the file is 4 bases away, and I would like to group only the reads that fall within a four-base range.
Hope I answered your question!
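For what it's worth, here is a rough sketch of the grouping in Python, based on my reading of the example rather than a tested pipeline. The assumptions are mine: a read joins a group when it is on the same chromosome and strand and its start is at most max_dist bases from the start of the first read in that group; scores are integers and are summed; the group end is the largest end seen; the name is taken from the first read. The names group_reads and max_dist are made up for this sketch. Interval merging with bedtools merge is based on overlap, so these four overlapping reads would likely collapse into a single record, which is why the sketch compares start positions instead.

```python
def group_reads(lines, max_dist=4):
    """Group BED6 reads by start position (assumed rule, see text above).

    A read joins the current group if it shares chromosome and strand and
    its start is within max_dist bases of the group's first start.
    """
    reads = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, start, end, name, score, strand = fields[:6]
        # Assumption: the score column holds integers, as in the example.
        reads.append((chrom, strand, int(start), int(end), name, int(score)))

    # Sort so reads that may share a group are adjacent.
    reads.sort(key=lambda r: (r[0], r[1], r[2]))

    groups = []
    for chrom, strand, start, end, name, score in reads:
        last = groups[-1] if groups else None
        if (last is not None and last[0] == chrom and last[5] == strand
                and start - last[1] <= max_dist):
            last[2] = max(last[2], end)  # extend the group to the furthest end
            last[4] += score             # sum the scores
        else:
            groups.append([chrom, start, end, name, score, strand])
    return groups


example = """\
Chr1   1023080  1023114 XYZ 806 +
Chr1   1023081  1023115 XYZ 50  +
Chr1   1023083  1023117 ABC 3   +
Chr1   1023085  1023119 cbd 90 +
"""

for group in group_reads(example.splitlines()):
    print("\t".join(str(value) for value in group))
```

On the four reads above this gives two records, Chr1 1023080 1023117 with score 859 and Chr1 1023085 1023119 with score 90. To run it on a real file, pass an open file handle instead of example.splitlines(). If you would rather chain reads (compare each start to the previous read instead of the first read in the group), the condition on last[1] is the line to change.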