Question: I can't sort a BED file; error in bedtools merge.
unique379 (Spain) wrote 3.0 years ago:

Dear all,

I have a BED file and want to process it with bedtools merge. I first sorted the file and then ran bedtools merge, but got this error:

Error: Sorted input specified, but the file ......

So I opened the file to look for the cause and realized that the command I used, sort -k 1,1 -k 2,2n, did not work properly: some lines are sorted and some are not.

chr1 149928394 149928450 chr1_163309 2 + 5
chr1 165786346 165786432 chr1_186360 2 - 30
chr11 66240033 66240092 chr11_24483 1 + 3
chr11 100970875 100970934 chr11_26610 1 + 2
chr11 122156196 122156259 chr11_38105 1 - 5
chr1 168141086 168141168 chr1_186480 3 - 747
chr1 182391712 182391791 chr1_187506 2 - 8
chr12 1805175 1805235 chr12_39518 1 + 3
chr12 21518670 21518726 chr12_41343 1 + 4
chr12 24743093 24743136 chr12_51637 1 - 9
chr12 30513872 30513960 chr12_52133 2 - 64
chr1 231586879 231586960 chr1_170106 2 + 19
chr1 242310594 242310652 chr1_192505 16 - 33
chr12 9025545 9025632 chr12_40322 8 + 19
chr12 52416011 52416060 chr12_53993 1 - 9
chr12 93282380 93282488 chr12_57473 16 - 29
chr12 94834227 94834317 chr12_47071 2 + 53
chr12 95308393 95308445 chr12_47159 3 + 12
chr13 20880735 20880793 chr13_65648 10 - 25
chr13 21148757 21148795 chr13_61611 1 + 339

Please help.

Note 1: sort version is (GNU coreutils) 8.22; OS is CentOS.

Note 2: I used the same file on a different system (Red Hat, sort version 5.97) and it worked perfectly; bedtools reported no problem.

Thanks


Try also sort -k1,1 -k2,2n -k3,3n. But in any case, the lines you posted do not look like the output of sort, regardless of GNU version and OS. I suspect something is missing here: maybe you are looking at the file before sorting, or the column delimiter is not a tab. Have a look at the file with cat -vet file.bed; you should see tabs as ^I and end-of-lines as $.

written 3.0 years ago by dariober
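
The delimiter check suggested above can be tried on a small file first. The following is a minimal sketch, assuming a POSIX shell with cat, awk and mktemp available; the three-line sample and the variable names sample and bad_lines are made up for illustration and are not taken from the poster's file.

```shell
# Hypothetical three-line sample; line 2 uses a space instead of the first tab.
sample=$(mktemp)
printf 'chr1\t100\t200\nchr1 300\t400\nchr2\t50\t60\n' > "$sample"

# Show tabs as ^I and line ends as $ (the check suggested in the comment above).
cat -vet "$sample"

# List the line numbers that have fewer than 3 tab-separated fields.
bad_lines=$(awk -F '\t' 'NF < 3 { print NR }' "$sample")
echo "lines with fewer than 3 tab-separated fields: $bad_lines"

rm -f "$sample"
```

Counting fields with awk can be easier than reading cat -vet output by eye on a large file, because it prints only the lines that deviate.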

Thanks, but this is the sorted BED file. As I said, I checked with cat -vet and found tabs as ^I and end-of-lines as $.

By the way, sort -k1,1 -k2,2n -k3,3n does not work for me either.

written 3.0 years ago by unique379

If you already use bedtools, why not use its sortBed function?

sortBed -i infile.bed > outfile.bed

Anyway, your example looks a bit suspicious: it has 7 columns, but usually it should have 6 or 12 (see here).

written 3.0 years ago by michael.ante
Alex Reynolds (Seattle, WA USA) wrote 3.0 years ago:

Consider using BEDOPS sort-bed:

sort-bed in.bed > out.bed

It generally works faster than Unix sort at sorting BED files, and it supports arbitrary column numbers and other potential use cases that are problematic for other tools.

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 3.0 years ago:

Try specifying LC_ALL and the delimiter.

LC_ALL=C sort -t '(insert-tab)' -k1,1 -k2,2n input.bed > out.bed
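
For context on the LC_ALL part of this suggestion: sort compares keys according to the active locale's collation rules, and LC_ALL=C switches it to plain byte order. A small sketch, assuming a POSIX shell; the four-letter input and the variable name c_order are illustrative only and unrelated to the BED file in the question.

```shell
# Show the locale variables that influence sort in the current shell.
echo "LC_ALL=${LC_ALL:-unset} LC_COLLATE=${LC_COLLATE:-unset} LANG=${LANG:-unset}"

# With LC_ALL=C, sort compares raw bytes, so uppercase letters come before lowercase.
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd, -)
echo "C locale order: $c_order"
```

Comparing the first line of output on the two machines mentioned in the question may show whether their locale settings differ.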

Thanks, Pierre, for the hint.

I used

LC_ALL=C sort -t$'\t' -k1,1 -k2,2n

instead of

LC_ALL=C sort -t '(insert-tab)' -k1,1 -k2,2n

and this time it worked.

written 3.0 years ago by unique379
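
The command that worked can be reproduced on a few lines from the question. This is a sketch only, assuming a POSIX shell and a sort that supports -c; the six records are taken from the posted data but cut to three columns and shuffled, and the variable names tab, unsorted, sorted and starts are chosen for illustration. It uses "$(printf '\t')" for the tab, which should behave like the bash-specific $'\t' used above while also working in plain sh.

```shell
tab=$(printf '\t')   # a literal tab character, portable across POSIX shells
unsorted=$(mktemp)
sorted=$(mktemp)

# Six records from the question (first three columns only), deliberately shuffled.
printf 'chr12\t1805175\t1805235\n'      >  "$unsorted"
printf 'chr1\t168141086\t168141168\n'   >> "$unsorted"
printf 'chr11\t66240033\t66240092\n'    >> "$unsorted"
printf 'chr1\t149928394\t149928450\n'   >> "$unsorted"
printf 'chr13\t20880735\t20880793\n'    >> "$unsorted"
printf 'chr11\t100970875\t100970934\n'  >> "$unsorted"

# Sort by chromosome (byte order), then by start position (numeric).
LC_ALL=C sort -t "$tab" -k1,1 -k2,2n "$unsorted" > "$sorted"
cat "$sorted"

# sort -c exits non-zero if the file is not in the requested order.
LC_ALL=C sort -c -t "$tab" -k1,1 -k2,2n "$sorted" && echo "file is sorted"

# Start positions in output order, for a quick visual check.
starts=$(cut -f2 "$sorted" | paste -sd, -)
echo "$starts"

rm -f "$unsorted"
```

Running the same sort -c check on the real file before calling bedtools merge is a cheap way to confirm the ordering that merge expects (chromosome, then start).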