Sorting a BED File
4.4 years ago

Hi all

I was trying to sort a BED file using the following command: sort -k1,1 -k2,2n in.bed > in.sorted.bed. The description of bedtools closest says:

bedtools closest requires that all input files are presorted data by chromosome and then by start position

However I am getting the outcome as follows:

  • chr22 38143248 38143734
  • chr2 238168767 238169127
  • chr22 38614588 38614867
  • chr2 239347857 239348510
  • chr22 39631244 39631333
  • chr22 39686679 39687117

I don't understand why the chr2 coordinates are sorted in between the chr22 ones. I tried to work out how the sort -k option works, but I'm not sure whether my understanding is correct.

Is it that in -k1,1, the k1 sorts the results by the values in the 1st column (chr here) and the ,1 uses the first character, i.e. c? Please help me understand it. Also, what does n do to the sorting?


What is the output of tr "\t" "#" < in.bed | head ?

Also what does n does to the sorting.

http://man7.org/linux/man-pages/man1/sort.1.html
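As a rough illustration of the key syntax asked about above, here is a minimal sketch, assuming a POSIX or GNU sort and a made-up tab-separated input (demo.bed and demo.sorted.bed are hypothetical file names, not the poster's data). In -kM,N the key starts at field M and ends at field N, so -k1,1 means "compare field 1 only", not "the first character"; the n suffix on -k2,2n asks for a numeric comparison of that field.

```shell
# Hypothetical toy input, tab-separated like a BED file.
printf 'chr22\t38143248\t38143734\nchr2\t238168767\t238169127\nchr22\t5000\t6000\n' > demo.bed

# A literal tab to pass to -t as the field separator.
TAB=$(printf '\t')

# LC_ALL=C forces byte-order comparison so the result does not depend on the locale.
# -k1,1  : key is field 1 only (chromosome), compared as text
# -k2,2n : key is field 2 only (start), compared as a number
LC_ALL=C sort -t "$TAB" -k1,1 -k2,2n demo.bed > demo.sorted.bed

cat demo.sorted.bed
```

With this toy input all chr2 lines come before all chr22 lines, and within chr22 the start 5000 comes before 38143248.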

4.4 years ago

Hi @Pierre

I found my mistake just now. I was using 2 instead of 2n in the second key, and that made all the difference.
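To make the difference between 2 and 2n concrete, here is a small sketch on invented data (demo2.bed, lexical.txt and numeric.txt are hypothetical names used only for this example). Without n the start column is compared character by character as text; with n it is compared as a number.

```shell
# Hypothetical toy input: one chromosome, start positions with different digit counts.
printf 'chr2\t238168767\t238169127\nchr2\t5000\t6000\nchr2\t38143248\t38143734\n' > demo2.bed

# A literal tab to pass to -t as the field separator.
TAB=$(printf '\t')

# Without n: field 2 is compared as text, so "238168767" < "38143248" < "5000".
LC_ALL=C sort -t "$TAB" -k1,1 -k2,2 demo2.bed | cut -f2 > lexical.txt

# With n: field 2 is compared as a number, so 5000 < 38143248 < 238168767.
LC_ALL=C sort -t "$TAB" -k1,1 -k2,2n demo2.bed | cut -f2 > numeric.txt

# Show the two orderings one after the other.
cat lexical.txt numeric.txt
```

Only the numeric ordering matches the "sorted by start position" requirement quoted from the bedtools closest description.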
