Tool: Sorting BED files with bash sort
8.7 years ago

I've been using this alias for the past few days and I wish I had set it up earlier.

It functions similarly to sortBed in bedtools but uses the standard UNIX sort command (invoked from bash), which can be used in pipes.

Find your .bashrc file in your home directory

$ cd $HOME
$ vi .bashrc # vi ~/.bashrc should work fine from any directory

Add this to .bashrc

alias sortbed="sort -k1,1 -k2,2g "

Save and source your .bashrc to get it to work

$ source ~/.bashrc
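
One caveat: bash does not expand aliases in non-interactive shells by default, so the alias is not available inside scripts. A shell function is one possible alternative. The following is a minimal sketch, not part of the original tip, and the sample intervals are made up for illustration.

```bash
# Function equivalent of the alias; "$@" forwards any extra options
# or file names to sort, so it can still be treated like sort.
sortbed() {
    sort -k1,1 -k2,2g "$@"
}

# Made-up, tab-separated BED lines piped through the function.
printf 'chr2\t500\t600\nchr1\t300\t400\nchr1\t50\t100\n' | sortbed
```

Placed in .bashrc instead of the alias line, the function behaves the same at the prompt, and it can also be copied into a script.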

Example of use

$ intersectBed -a in.bed -b /segDup_unmappable.bed -wao | sortbed |uniq | cut -f 1,2,3,8 >in_segDup_unmappable_overlap.txt

This intersects a BED file of chr, start, end with a list of segmental duplications and unmappable regions in hg19. The output is then piped through standard UNIX commands to keep only the positions from in.bed and the number of base pairs overlapping each one.

sortbed is used to sort the output and uniq is applied to return only unique lines. You can treat sortbed like sort. Just a nice shortcut I thought others might like.
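
To see what the sort, uniq and cut steps do without running bedtools, here is a small sketch on made-up data. The function fake_wao is hypothetical and only prints lines shaped like `-wao` output, assuming 3 columns from in.bed, 4 columns from the second file, and the overlap in base pairs as column 8.

```bash
# Same sort as the alias, written as a function so it also works in scripts.
sortbed() {
    sort -k1,1 -k2,2g "$@"
}

# Hypothetical stand-in for `intersectBed -wao` output (tab-separated).
# The first and third lines are identical on purpose.
fake_wao() {
    printf 'chr1\t300\t400\tchr1\t350\t450\tsegDup\t50\n'
    printf 'chr1\t100\t200\t.\t-1\t-1\t.\t0\n'
    printf 'chr1\t300\t400\tchr1\t350\t450\tsegDup\t50\n'
}

# Sort, drop the duplicate line, keep chr/start/end and the overlap column.
fake_wao | sortbed | uniq | cut -f 1,2,3,8
```

Sorting first matters because uniq only collapses identical lines that are adjacent.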

bash bed • 5.2k views

Like I said, the bedtools sort documentation already gives the UNIX command for sorting BED files; I don't see any advantage in your approach.


and set LC_ALL=C to make things faster.
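
A minimal sketch of that suggestion, assuming the locale should apply only to the sort process rather than the whole session. The name sortbed_c is hypothetical and the input lines are made up.

```bash
# Prefixing the command with LC_ALL=C sets the locale for this sort call only;
# in the C locale, sort compares raw bytes instead of using collation rules.
sortbed_c() {
    LC_ALL=C sort -k1,1 -k2,2g "$@"
}

printf 'chrX\t10\t20\nchr1\t30\t40\nchr1\t5\t9\n' | sortbed_c
```

Byte order can differ from a locale-aware order (for example with mixed-case names), so it is worth using the same locale for every tool that expects sorted input.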


BEDOPS sort-bed works faster at sorting BED files than GNU sort, and you can pipe data in and out via standard UNIX streams.

$ upstream-process ... | sort-bed - | downstream-process ...

Unlike other tools, it also handles arbitrary numbers of columns and can be assigned a chunk of memory, to sort very large BED files that will not otherwise fit into system memory.

-k1,1V -k2,2g

Add version sort (the V modifier, available in GNU sort) to the first key so that chromosomes are sorted in natural order, e.g. chr2 before chr10.
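
A small sketch of the difference, assuming GNU sort (the V modifier is not part of POSIX sort). The name sortbed_v is hypothetical and the intervals are made up.

```bash
# Version sort on column 1 puts chr2 before chr10;
# column 2 is still sorted numerically within each chromosome.
sortbed_v() {
    sort -k1,1V -k2,2g "$@"
}

printf 'chr10\t100\t200\nchr2\t300\t400\nchr1\t50\t60\nchr2\t5\t10\n' | sortbed_v
```

With plain -k1,1 the same input would list chr10 before chr2, because the names are compared character by character.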
