Divide coordinates into an equal number of bins
13 months ago
DARLOR • 0

Hello people, I need help

I have some coordinates:

coords
# A tibble: 6 × 3
  Chr      Start      End
  <chr>    <int>    <int>
1 chr12 73041827 73046712
2 chr15 93499114 93595891
3 chr7  51879145 51994458
4 chr13 88821472 89323225
5 chr19 14448072 14597983
6 chr1  21398403 21961942

As you can see, they have different lengths, so I would like to divide each of them into an equal number of bins (e.g. 200). Should I use deepTools? I would appreciate any suggestions.

r granges • 750 views

What have you tried? Did a Google search for "bin granges" reveal anything useful?


No, I didn't find anything useful for me


That is impossible. I'm pretty sure the first results page would contain the tile function mentioned by benformatics below. Invest a decent amount of effort before asking others to put in work for you.

EDIT: My first Google result is https://support.bioconductor.org/p/76875/, which has the answer you need. You invested zero effort. That is just lazy.


Sorry man, I found your first Google result, and I think it is not what I was searching for. Of course I search the internet before asking here. You cannot judge; I don't want anyone to do the work for me. What are you doing here, what's your job? Judging other people? Maybe I asked the question badly, so I apologize for that. Goodbye


I think it is not what I was searching for

It is, though. Your wording is unclear: you want to tile each range into either 200 bins or bins 200 bp wide. Both can be achieved with tile, whose documentation states:

tile partitions each range into a set of tiles, which are defined in terms of their number or width.
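
To make the two readings concrete, here is a minimal sketch on a made-up 1000 bp range (the range is hypothetical and not taken from the question). It assumes the GenomicRanges package is installed and only uses tile with its n and width arguments as described in the quoted documentation.

```r
library(GenomicRanges)

# Hypothetical toy range, 1000 bp wide, used only for illustration
toy <- GRanges("chr1:1-1000")

# Reading 1: a fixed NUMBER of bins per range (200 bins)
by_n <- tile(toy, n = 200L)
length(by_n[[1]])         # 200 tiles
unique(width(by_n[[1]]))  # each tile is 5 bp wide

# Reading 2: a fixed bin WIDTH (200 bp)
by_w <- tile(toy, width = 200L)
length(by_w[[1]])         # 5 tiles
unique(width(by_w[[1]]))  # each tile is 200 bp wide
```

With n the number of bins is the same for every range and the bin width varies with the range length; with width it is the other way around.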

The proper etiquette is to mention things you've tried when you ask the question. My job is to ensure questions follow an acceptable standard and the forum is being used properly, and when I spot a possible problem, I nudge things in the right direction. Again, invest effort both in trying to figure out the solution yourself AND in asking the question here.


Ok, thank you. I need to look more carefully.

13 months ago

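First you need to convert the coordinates to a GRanges object. Note that paste0 alone only builds "chr:start-end" strings, so they have to be passed to GRanges():

library(GenomicRanges)
gr <- GRanges(paste0(coords$Chr, ':', coords$Start, '-', coords$End))

Tiling is built into the GenomicRanges package through the tile function; see its documentation and the example that follows:

http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/GenomicRanges/html/tile-methods.html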

gr <- GRanges(
        seqnames=Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
        ranges=IRanges(1:10, end=11),
        strand=Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
        seqlengths=c(chr1=11, chr2=12, chr3=13))

# split every range in half
tiles <- tile(gr, n = 2L)
stopifnot(all(elementNROWS(tiles) == 2L))

# split ranges into subranges of width 2
# odd width ranges must contain one subrange of width 1
tiles <- tile(gr, width = 2L)
stopifnot(all(all(width(tiles) %in% c(1L, 2L))))

windows <- slidingWindows(gr, width=3L, step=2L)
width(windows[[1L]]) # last range is truncated
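
Applied to the coordinates from the question, a minimal sketch could look like the following. It assumes "200 bins per region" is what is wanted; the names bins, bins_df and the region column are introduced here for illustration only.

```r
library(GenomicRanges)

# Coordinates copied from the question
coords <- data.frame(
  Chr   = c("chr12", "chr15", "chr7", "chr13", "chr19", "chr1"),
  Start = c(73041827L, 93499114L, 51879145L, 88821472L, 14448072L, 21398403L),
  End   = c(73046712L, 93595891L, 51994458L, 89323225L, 14597983L, 21961942L)
)

gr <- GRanges(seqnames = coords$Chr,
              ranges = IRanges(start = coords$Start, end = coords$End))

# 200 bins per region; the result is a GRangesList with one element per region
bins <- tile(gr, n = 200L)
stopifnot(all(elementNROWS(bins) == 200L))

# Flatten to a data frame and record which input region each bin came from
bins_df <- as.data.frame(unlist(bins))
bins_df$region <- rep(seq_along(bins), elementNROWS(bins))
head(bins_df)
```

Because the regions have different lengths, the bin width differs between regions, and within a region it can differ by 1 bp when the length is not a multiple of 200.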