Question: Bedtools trouble with double digit chromosomes
sebastianzeki0 (United Kingdom) wrote 2.8 years ago [0 votes]:

I'm trying to use the bedtools closest function to compare a couple of datasets. It works fine for single-digit chromosomes; however, when comparing ranges on chromosomes with double-digit numbers, it does not seem to be able to make the comparison and reports no match (a "." and -1 values). For example:

 

chr9   222268    30116164   chr9  29909450  29909600  0
chr9   222268    30116164   chr9  29926499  29926649  0
chr9   222268    30116164   chr9  30050824  30050974  0
chr10  214399    8156391    .     -1        -1        -1
chr15  19138465  20536973   .     -1        -1        -1
chr15  83671081  100021943  .     -1        -1        -1

 

Has anyone else had this problem and what can I do to fix it?

Tags: sequencing, bedtools
Charles Plessy (Japan) wrote 2.8 years ago [2 votes]:

In the absence of a test case to reproduce your problem it is hard to answer, but my gut feeling is that your input files are not sorted in the same way. For instance, the chromosome order might be chr1, chr2, chr3, ..., chr10, chr11, ... in one file and chr1, chr10, chr11, ..., chr19, chr2, chr3, ... in the other. The solution is then to sort all files the same way, for instance with sort -k1,1 -k2,2n.
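As a rough illustration of sorting all files the same way, here is a minimal sketch. The file names a.bed and b.bed and their contents are hypothetical, and the LC_ALL=C setting is an assumption added here so that the text sort of the chromosome column does not depend on the locale. The bedtools step only runs if bedtools happens to be installed.

```shell
# Hypothetical tab-separated BED files listing chromosomes in different orders.
printf 'chr2\t100\t200\nchr10\t300\t400\nchr9\t50\t60\n' > a.bed
printf 'chr10\t350\t450\nchr9\t10\t20\nchr2\t500\t600\n' > b.bed

# Sort both files identically: chromosome name as text, then start as a number.
# LC_ALL=C (an assumption, not from the thread) pins the collation order.
LC_ALL=C sort -k1,1 -k2,2n a.bed > a.sorted.bed
LC_ALL=C sort -k1,1 -k2,2n b.bed > b.sorted.bed

# Print the chromosome order of each sorted file so the two can be compared.
cut -f1 a.sorted.bed | uniq
cut -f1 b.sorted.bed | uniq

# Run the comparison on the sorted files, if bedtools is available.
if command -v bedtools >/dev/null 2>&1; then
    bedtools closest -a a.sorted.bed -b b.sorted.bed
fi
```

With this kind of sort, chr10 comes before chr2 and chr9 in both files. That is not the "natural" chromosome order, but what matters for the comparison is that both inputs follow the same one.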

Alex Reynolds (Seattle, WA USA) wrote 2.8 years ago [0 votes]:

Your data are probably not sorted:

$ sort-bed unsorted_dataset.bed > sorted_dataset.bed
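
If you want to confirm the diagnosis before re-sorting, one option is to let sort check the order without rewriting anything. This is only a sketch: the file contents below are made up to mimic the question (chr9 listed before chr10), and it assumes a plain text ordering of chromosome names followed by a numeric start, pinned with LC_ALL=C.

```shell
# Hypothetical input with chr9 listed before chr10, as in the question.
printf 'chr9\t222268\t30116164\nchr10\t214399\t8156391\n' > unsorted_dataset.bed

# sort -c only checks the order; it exits non-zero if the file is out of order.
if LC_ALL=C sort -c -k1,1 -k2,2n unsorted_dataset.bed 2>/dev/null; then
    status="sorted"
else
    status="not sorted"
fi
echo "unsorted_dataset.bed is $status"
```

Here the check fails because, compared as text, chr10 sorts before chr9.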