I have a lot of duplicate reads in my RNA-seq data, which is a good thing from my point of view. Is there any way to get the coordinates of these duplicate reads (more than 10 reads) from a BAM/SAM file? Ideally, I would also like to annotate these regions with overlapping or flanking genes and output the results in BED format. Any suggestions are appreciated, but solutions using R packages are preferred.
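To make the request concrete, here is a rough sketch of the counting step I have in mind, written in plain Python only for illustration (an R equivalent would be ideal). It assumes that "duplicate" means reads sharing the same chromosome, start, end, and strand, and that the input is uncompressed SAM text; the function names `reference_span` and `duplicate_regions` are made up for this example. The gene annotation step is not covered.

```python
import re
from collections import Counter

def reference_span(cigar):
    """Return the number of reference bases covered by a CIGAR string."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    # Only M, D, N, = and X consume reference positions.
    return sum(int(n) for n, op in ops if op in "MDN=X")

def duplicate_regions(sam_lines, min_count=10):
    """Return BED6 lines for positions shared by more than min_count reads.

    Assumption: reads are duplicates when chromosome, start, end and
    strand are identical. Adjust the key if another definition is needed.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x904 or cigar == "*":
            continue
        strand = "-" if flag & 0x10 else "+"
        start = pos - 1  # SAM is 1-based, BED is 0-based half-open
        end = start + reference_span(cigar)
        counts[(chrom, start, end, strand)] += 1

    bed = []
    for (chrom, start, end, strand), n in sorted(counts.items()):
        if n > min_count:
            # The read count is stored in the BED score column.
            bed.append(f"{chrom}\t{start}\t{end}\tdup\t{n}\t{strand}")
    return bed
```

For a BAM file, the same function could be fed the output of `samtools view`, for example by reading it line by line from a pipe.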
Here are some artificial example datasets for testing (they might be deleted after one year).