UCSC Genome Browser binning
1
1
Entering edit mode
8.3 years ago

I am reading The Human Genome Browser at UCSC by Kent, et al.

In Figure 7, they discuss a binning approach to search for elements which overlap or are contained in a given range:

From the paper:

Features are put in the smallest bin in which they fit. A feature covering the range indicated by line A would go in bin 1. Similarly, line B goes in bin 4 and line C in bin 20. When the browser needs to access features in a region, it must look in bins of all different sizes. To access all the features that overlapped or were enclosed by line A, the browser looks in bins 1, 2, 3, 7, 8, 9, 10, and 11. For B the browser looks in bins 1, 4, 14, 15, 16, 17. For C, the browser looks in bins 1, 5, and 20.

The bins are sized in the following manner: 128 kb, 1 Mb, 8 Mb, 64 Mb, and 512 Mb.

My guess is that references to each element/feature (not the element itself) are stored in the smallest bin that will contain it. If that's correct, is there any compression done to the list of references?

Further, how is binning done for arbitrary elements or search queries larger than the largest bin size? For example, let's say I provide input elements that were the extents of arbitrary chromosomes longer than 512 Mb, which would not be impossible for some organisms. How are searches then performed for elements of that type? Is there an ordered list of adjacent bin structures to span the minimum and maximum range over the entire input element set?

Are elements/features (element/feature references) retrieved in sort order? (Are bins queried in a certain order?)

computation algorithm binning ucsc • 2.9k views
ADD COMMENT
1
Entering edit mode
8.2 years ago

provide input elements that were the extents of arbitrary chromosomes longer than 512 Mb

as far as I know, there are two versions of the binning system: http://genomewiki.ucsc.edu/index.php/Bin_indexing_system (Extended implementation). I only saw the first "initial" implementation:

ADD COMMENT

Login before adding your answer.

Traffic: 1199 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6