Question: Bedtools trouble with double digit chromosomes
5.5 years ago by
United Kingdom
sebastianzeki0180 wrote:

I'm trying to use the bedtools closest function to compare a couple of datasets. It works fine for most chromosomes; however, when comparing ranges on chromosomes with double-digit numbers, it seems unable to make the comparison and reports no match ("." and -1). For example:


chr9   222268    30116164   chr9  29909450  29909600  0
chr9   222268    30116164   chr9  29926499  29926649  0
chr9   222268    30116164   chr9  30050824  30050974  0
chr10  214399    8156391    .     -1        -1        -1
chr15  19138465  20536973   .     -1        -1        -1
chr15  83671081  100021943  .     -1        -1        -1


Has anyone else had this problem and what can I do to fix it?

sequencing bedtools • 1.3k views
5.5 years ago by
Charles Plessy2.7k wrote:

Without a test case to reproduce your problem it is hard to answer, but my gut feeling is that the files you are using as input are not sorted in the same way. For instance, the chromosomes might be ordered chr1, chr2, chr3, ..., chr10, chr11, ... in one file and chr1, chr10, chr11, ..., chr19, chr2, chr3, ... in the other. The solution is then to sort all files the same way, for instance with sort -k1,1 -k2,2n.
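
As a rough sketch of that suggestion, the snippet below sorts two small BED files identically before comparing them. The file names and coordinates are made up for illustration, and LC_ALL=C is an extra assumption added here so that the chromosome order does not depend on the locale. The bedtools call only runs if bedtools is installed.

```shell
# Hypothetical input files with toy coordinates; replace with your own BED files.
printf 'chr10\t214399\t8156391\nchr9\t222268\t30116164\nchr2\t500\t900\nchr1\t100\t200\n' > a.bed
printf 'chr9\t29909450\t29909600\nchr10\t300000\t300150\nchr1\t150\t300\n' > b.bed

# Sort both files the same way: chromosome name as text, then start as a number.
# LC_ALL=C is an assumption added here to make the text ordering locale-independent.
LC_ALL=C sort -k1,1 -k2,2n a.bed > a.sorted.bed
LC_ALL=C sort -k1,1 -k2,2n b.bed > b.sorted.bed

# Both files now share one chromosome order (chr1, chr10, chr2, chr9 for a.sorted.bed).
cut -f1 a.sorted.bed

# Run the comparison on the sorted files, only if bedtools is available.
if command -v bedtools >/dev/null 2>&1; then
    bedtools closest -a a.sorted.bed -b b.sorted.bed
fi
```

The point is not which order is chosen (chr10 before chr2 looks odd but is fine), only that every input file uses the same one.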

5.5 years ago by
Seattle, WA USA
Alex Reynolds30k wrote:

Your data are probably not sorted:

$ sort-bed unsorted_dataset.bed > sorted_dataset.bed