Question

Merging And Binning Genomic Segments

6

13.4 years ago

Ryan D ★ 3.4k

I asked a form of this question previously here http://biostar.stackexchange.com/questions/3676/merging-genomic-segments-separated-by-some-distance-number-of-markers but did not get much response.

Given genomic regions (CNVs in my case) of some size or number of markers, how can I use existing tools to (1) merge them if they are near enough to one another (by distance or by number of markers) and (2) bin them if they are close enough to one another, though not identical?

Some examples of my problem (the binning/merging problem) are given here: http://dl.dropbox.com/u/9445847/help.ppt

For more on the data format, see the earlier post.

Thanks, Rx.

cnv merge genomics • 3.1k views

updated 13.3 years ago by brentp 24k • written 13.4 years ago by Ryan D ★ 3.4k

Ram · Answer 1 · 2010-12-07

There is a very nice implementation of a cluster tree in bx-python (available on PyPI). The description in the file is:

Provides a ClusterTree data structure that supports efficient finding of clusters of intervals that are within a certain distance apart.

You can see examples in the tests.
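To make the idea concrete, here is a minimal pure-Python sketch of distance-based merging. It is not bx-python's implementation or API; the function name `merge_nearby`, the `max_gap` parameter, and the `(chrom, start, end, seg_id)` tuple layout are all assumptions made for illustration. It sorts segments per chromosome and joins neighbours whose gap is at most `max_gap` bases.

```python
from collections import defaultdict


def merge_nearby(segments, max_gap):
    """Merge segments on the same chromosome separated by <= max_gap bases.

    segments: iterable of (chrom, start, end, seg_id) with start < end.
    Returns a list of (chrom, start, end, [seg_ids]), sorted by chrom and start.
    """
    # Group segments by chromosome so we never merge across chromosomes.
    by_chrom = defaultdict(list)
    for chrom, start, end, seg_id in segments:
        by_chrom[chrom].append((start, end, seg_id))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        cur_ids = []
        for start, end, seg_id in sorted(by_chrom[chrom]):
            if cur_ids and start - cur_end <= max_gap:
                # Close enough to the current cluster: extend it.
                cur_end = max(cur_end, end)
                cur_ids.append(seg_id)
            else:
                # Too far away: flush the current cluster and start a new one.
                if cur_ids:
                    merged.append((chrom, cur_start, cur_end, cur_ids))
                cur_start, cur_end, cur_ids = start, end, [seg_id]
        if cur_ids:
            merged.append((chrom, cur_start, cur_end, cur_ids))
    return merged


# Hypothetical example data: "a" and "b" are 50 bases apart, "c" is far away.
cnvs = [
    ("chr1", 100, 200, "a"),
    ("chr1", 250, 400, "b"),
    ("chr1", 1000, 1200, "c"),
    ("chr2", 100, 200, "d"),
]
clusters = merge_nearby(cnvs, max_gap=100)
```

With `max_gap=100`, segments "a" and "b" end up in one cluster and "c" and "d" stay on their own. Merging by number of markers instead of bases would work the same way if `start` and `end` are marker indices rather than coordinates.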

Another option is to use the BigBed tools from UCSC (see this paper). From there, you can write a script to query the binary format in windows and generate your binned data.
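The windowed binning step could look roughly like the sketch below. It works on plain in-memory tuples rather than a BigBed file, so it shows only the binning logic, not the UCSC tools; the function name `bin_counts`, the `window` parameter, and the half-open `(chrom, start, end)` layout are assumptions for illustration.

```python
from collections import Counter


def bin_counts(segments, window):
    """Count how many segments overlap each fixed-size window.

    segments: iterable of (chrom, start, end), half-open, with start < end.
    Returns a Counter keyed by (chrom, window_start).
    """
    counts = Counter()
    for chrom, start, end in segments:
        first = start // window        # index of the first window touched
        last = (end - 1) // window     # index of the last window touched
        for idx in range(first, last + 1):
            counts[(chrom, idx * window)] += 1
    return counts


# Hypothetical example: the middle segment straddles the 1000 boundary.
segs = [("chr1", 100, 200), ("chr1", 900, 1100), ("chr1", 1500, 1600)]
binned = bin_counts(segs, window=1000)
```

A segment that spans a window boundary is counted in every window it touches, which is one possible convention; assigning each segment to the window containing its midpoint is another.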

score 1 · Answer 2 · 2010-12-07

1

13.4 years ago

Chris Miller 22k

A common way to look at this sort of problem is to use the Minimal Common Region (MCR). Other, more nuanced approaches to finding recurrent copy-number alterations are implemented in tools like GISTIC, RAE, and RTS (which I can't find a link for at the moment).
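As a rough illustration, one simple reading of an MCR is the stretch covered by the largest number of overlapping segments. The sketch below finds that with a sweep over segment boundaries; it is an assumption about what "minimal common region" means here, not the method used by GISTIC, RAE, or RTS, and the function name `max_coverage_regions` is made up for the example.

```python
def max_coverage_regions(segments):
    """Return (depth, regions) for the most deeply covered stretches.

    segments: iterable of (start, end) on one chromosome, half-open.
    regions is a list of (start, end) intervals covered by `depth` segments.
    """
    events = []
    for start, end in segments:
        events.append((start, 1))   # a segment opens
        events.append((end, -1))    # a segment closes
    # At equal positions, closes (-1) sort before opens (+1), which matches
    # half-open intervals: touching segments do not overlap.
    events.sort()

    best_depth, regions = 0, []
    depth, prev = 0, None
    for pos, delta in events:
        # The stretch [prev, pos) is covered by `depth` segments.
        if prev is not None and pos > prev and depth > 0:
            if depth > best_depth:
                best_depth, regions = depth, [(prev, pos)]
            elif depth == best_depth:
                regions.append((prev, pos))
        depth += delta
        prev = pos
    return best_depth, regions


# Hypothetical CNV calls from three samples on the same chromosome.
mcr_depth, mcr_regions = max_coverage_regions(
    [(100, 500), (300, 700), (400, 600)]
)
```

Here all three segments share only the stretch from 400 to 500, so that interval is reported at depth 3. Real data would need this run per chromosome and separately for gains and losses.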
