GenomicRanges: Limit overlaps to one occurrence
14 months ago
sorrymouse ▴ 120

I have several genomic ranges datasets with metadata. I want to build one large master set that records, for each range, the overlapping range from every dataset, with a gap wherever a dataset has no overlapping range. However, some datasets have more than one range that overlaps the same master range. When the next dataset is then overlapped, the rows start repeating, because every combination of hits is returned.

For example:

Dataset 1:

chr2L   13550276    13551760

Dataset 2:

chr2L   13550975    13551760
chr2L   13550276    13550808

Dataset 3:

chr2L   13550975    13551734
chr2L   13550304    13550803

Both join and plyranges produce the following:

chr2L   13550276    13551760    chr2L   13550975    13551760    chr2L   13550304    13550803
chr2L   13550276    13551760    chr2L   13550975    13551760    chr2L   13550975    13551734
chr2L   13550276    13551760    chr2L   13550276    13550808    chr2L   13550975    13551734
chr2L   13550276    13551760    chr2L   13550276    13550808    chr2L   13550304    13550803

I want it so that once a range has been pulled out of the bag, it cannot be pulled again, so that the output looks more like this:

chr2L   13550276    13551760    chr2L   13550975    13551760    chr2L   13550975    13551734
chr2L   13550276    13551760    chr2L   13550276    13550808    chr2L   13550304    13550803

Another way of thinking about it: column 1 is the master set, and the other columns only need to overlap column 1, not each other.
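
To make the intended behavior concrete, here is a minimal sketch of the pairing logic in plain Python rather than GenomicRanges code. It rests on assumptions not stated above: ranges are closed intervals stored as `(chrom, start, end)` tuples, the hits from each dataset are sorted by start, and the k-th hit from one dataset is paired with the k-th hit from the next. The names `overlaps` and `rank_join` are made up for illustration.

```python
from itertools import zip_longest

def overlaps(a, b):
    """True if two (chrom, start, end) ranges share at least one base.

    Assumes closed intervals on the same chromosome.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]

def rank_join(master, *datasets):
    """Pair each master range with its k-th hit from every dataset.

    Each dataset is compared only against the master ranges, never
    against the other datasets, so a hit is used at most once per
    master range. None marks a gap.
    """
    rows = []
    for m in master:
        # Hits per dataset for this master range, sorted by (chrom, start, end).
        columns = [sorted(r for r in ds if overlaps(m, r)) for ds in datasets]
        if not any(columns):
            # No dataset overlaps this master range: keep it with gaps only.
            rows.append((m,) + (None,) * len(datasets))
            continue
        # zip_longest pads the shorter hit lists with None (the gaps).
        for combo in zip_longest(*columns, fillvalue=None):
            rows.append((m, *combo))
    return rows

ds1 = [("chr2L", 13550276, 13551760)]
ds2 = [("chr2L", 13550975, 13551760), ("chr2L", 13550276, 13550808)]
ds3 = [("chr2L", 13550975, 13551734), ("chr2L", 13550304, 13550803)]

rows = rank_join(ds1, ds2, ds3)
for row in rows:
    print(row)
```

On the three example datasets this gives two rows instead of four, the same two rows as the desired output above (in start order). Pairing by rank is only one possible rule; it happens to match the example, but a different tie-breaking rule might be needed for other data.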

Any ideas?

R GenomicRanges • 295 views