Question: I can't sort a BED file; error in bedtools merge.
2.8 years ago, unique379 (Spain) wrote:

Dear all,

I have a BED file and am trying to process it with bedtools merge. I first sorted the file and then ran bedtools merge, but got an error:

Error: Sorted input specified, but the file ......

So I opened the file to look for the problem and realized that the sort -k 1,1 -k 2,2n command I used did not work properly: some lines are sorted and some are not.

chr1 149928394 149928450 chr1_163309 2 + 5
chr1 165786346 165786432 chr1_186360 2 - 30
chr11 66240033 66240092 chr11_24483 1 + 3
chr11 100970875 100970934 chr11_26610 1 + 2
chr11 122156196 122156259 chr11_38105 1 - 5
chr1 168141086 168141168 chr1_186480 3 - 747
chr1 182391712 182391791 chr1_187506 2 - 8
chr12 1805175 1805235 chr12_39518 1 + 3
chr12 21518670 21518726 chr12_41343 1 + 4
chr12 24743093 24743136 chr12_51637 1 - 9
chr12 30513872 30513960 chr12_52133 2 - 64
chr1 231586879 231586960 chr1_170106 2 + 19
chr1 242310594 242310652 chr1_192505 16 - 33
chr12 9025545 9025632 chr12_40322 8 + 19
chr12 52416011 52416060 chr12_53993 1 - 9
chr12 93282380 93282488 chr12_57473 16 - 29
chr12 94834227 94834317 chr12_47071 2 + 53
chr12 95308393 95308445 chr12_47159 3 + 12
chr13 20880735 20880793 chr13_65648 10 - 25
chr13 21148757 21148795 chr13_61611 1 + 339

Please help.

Note 1: sort (GNU coreutils) 8.22, OS = CentOS.

Note 2: I used the same file on a different system (Red Hat, sort version 5.97) and it worked perfectly; bedtools reported no problem.

Thanks


Try also sort -k1,1 -k2,2n -k3,3n. But anyway the lines you post do not seem to be the output of sort, regardless of GNU version and OS. I suspect there is something missing here, maybe you are looking at the file before sorting or the column delimiter is not tab... Maybe have a look at the file with cat -vet file.bed, you should see tab as ^I and end-of-lines as $.
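As a minimal sketch of the check suggested above: the sample file below is hypothetical (one tab-delimited line and one space-delimited line), and `sample` and `no_tab_lines` are names made up for illustration. It shows what `cat -vet` prints for each case and counts lines that contain no tab at all.

```shell
# Hypothetical two-line sample: line 1 is tab-delimited, line 2 uses spaces.
sample=$(mktemp)
printf 'chr1\t100\t200\nchr1 300 400\n' > "$sample"

# With GNU cat, -vet prints tabs as ^I and marks each line end with $.
cat -vet "$sample"

# Count lines that contain no tab (candidates for a wrong delimiter).
no_tab_lines=$(grep -vc "$(printf '\t')" "$sample")
echo "lines without a tab: $no_tab_lines"
```

If the count is above zero for a real BED file, a tab-based sort key would not split those lines into columns as expected.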

Reply written 2.8 years ago by dariober (modified 12 weeks ago by RamRS)

Thanks, but this is the sorted BED file. I checked it with cat -vet and found tabs shown as ^I and end-of-lines as $.

By the way, sort -k1,1 -k2,2n -k3,3n does not work for me either.

Reply written 2.8 years ago by unique379 (modified 12 weeks ago by RamRS)

If you already use bedtools, why not use its sortBed function?

sortBed -i infile.bed > outfile.bed

Anyway, your example looks a bit suspicious: it has 7 columns, but usually a BED file should have 6 or 12.
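To check the column count mentioned above, one option is to tally the number of tab-separated fields per line. This is only a sketch: the sample reuses two lines from the question, written here with tabs as an assumption, and `work` and `col_counts` are illustrative names.

```shell
work=$(mktemp -d)
# Hypothetical sample: two 7-column lines from the question, tab-delimited here.
printf 'chr1\t149928394\t149928450\tchr1_163309\t2\t+\t5\nchr1\t165786346\t165786432\tchr1_186360\t2\t-\t30\n' > "$work/in.bed"

# For each distinct field count, print "<fields> <number of lines>".
col_counts=$(awk -F '\t' '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$work/in.bed")
echo "$col_counts"
```

More than one distinct field count, or a count of 1, would hint at mixed or non-tab delimiters.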

Reply written 2.8 years ago by michael.ante (modified 12 weeks ago by RamRS)
2.8 years ago, Alex Reynolds (Seattle, WA, USA) wrote:

Consider using BEDOPS sort-bed:

sort-bed in.bed > out.bed

It generally works faster than Unix sort at sorting BED files, and it supports arbitrary column numbers and other potential use cases that are problematic for other tools.

2.8 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Try specifying LC_ALL and the delimiter.

LC_ALL=C sort -t '(insert-tab)' -k1,1 -k2,2n input.bed > out.bed

Thanks Pierre for your hint....

I used

LC_ALL=C sort -t$'\t' -k1,1 -k2,2n

instead of

LC_ALL=C sort -t '(insert-tab)' -k1,1 -k2,2n

and this time it worked.
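For anyone who wants to confirm the result before running bedtools merge, here is a small sketch under stated assumptions: the four-line input is a hypothetical sample using chromosome names from the question, and `work`, `tab`, `sorted_ok` and `chrom_order` are illustrative names. It sorts with byte-wise collation and an explicit tab delimiter, then uses `sort -c` to check the order.

```shell
# A literal tab; in bash, $'\t' (as used above) is equivalent.
tab=$(printf '\t')
work=$(mktemp -d)

# Hypothetical unsorted sample, tab-delimited.
printf 'chr11\t66240033\t66240092\nchr1\t168141086\t168141168\nchr12\t1805175\t1805235\nchr1\t149928394\t149928450\n' > "$work/in.bed"

# Byte-wise collation (LC_ALL=C), tab as field delimiter, numeric start column.
LC_ALL=C sort -t "$tab" -k1,1 -k2,2n "$work/in.bed" > "$work/out.bed"

# sort -c exits non-zero if the file is not in the requested order.
if LC_ALL=C sort -c -t "$tab" -k1,1 -k2,2n "$work/out.bed"; then
    sorted_ok=yes
else
    sorted_ok=no
fi

# Chromosome column of the sorted file, joined on one line for a quick look.
chrom_order=$(cut -f1 "$work/out.bed" | paste -sd ' ' -)
echo "sorted: $sorted_ok; chromosome order: $chrom_order"
```

Running the same `sort -c` check on the file that bedtools rejected should show the first out-of-order line.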

Reply written 2.8 years ago by unique379 (modified 12 weeks ago by RamRS)