How to sort numbers in two columns in Ascending Order
4.7 years ago
Kumar ▴ 170

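I have a large file in which the numbers in columns 2 and 3 are not in order. On each line, I need the smaller number in column 2 and the larger number in column 3, i.e. ascending order within the row. Please see the example file below and suggest a solution.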

File:

NODE_55_length_30858_cov_27.421 19951 19901
NODE_55_length_30858_cov_27.421 19930 19900
NODE_55_length_30858_cov_27.421 17578 17613
NODE_55_length_30858_cov_27.421 16544 16578
NODE_55_length_30858_cov_27.421 19982 19932

OUTPUT:

NODE_55_length_30858_cov_27.421 19901 19951
NODE_55_length_30858_cov_27.421 19900 19930
NODE_55_length_30858_cov_27.421 17578 17613
NODE_55_length_30858_cov_27.421 16544 16578
NODE_55_length_30858_cov_27.421 19932 19982

gene sequencing • 956 views

With thanks to two members who answered, I'd like to hear opinions about the original question. It strikes me as a column-sorting question which is outside of the scope.

Does the fact the poster is interested in sorting the names of assembly contigs make it within the scope? I see this pop up occasionally where non-bioinformatics questions are answered because they are tangentially related to bioinformatics topics, while other questions are quickly shut down.


In my practical bioinformatics work (in customer service), this is a very frequent task. This allows at least two conclusions:

  1. Some bioinformatics specialists in customer service jobs are overpaid column-sorting experts.
  2. Complex column sorting is a non-trivial task for beginners.

For narcissistic reasons I'd largely favor the second conclusion, but I admit complex sorting was far less trivial years ago, when I needed a Schwartzian transform in Perl for things that Python, for example, can do naturally.
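
To make the comparison concrete, here is a minimal sketch of the same sort done both ways in Python. The rows come from the question's example, but the sort order chosen here (contig name, then the smaller coordinate, then the larger) is only an assumed illustration of a "complex" sort, and `sort_key` is a hypothetical helper name.

```python
# Rows as (contig, col2, col3); values taken from the question's example.
rows = [
    ("NODE_55_length_30858_cov_27.421", 19951, 19901),
    ("NODE_55_length_30858_cov_27.421", 17578, 17613),
    ("NODE_55_length_30858_cov_27.421", 16544, 16578),
]


def sort_key(row):
    """Hypothetical key: contig name, then smaller and larger coordinate."""
    contig, a, b = row
    return (contig, min(a, b), max(a, b))


# Explicit decorate-sort-undecorate, the shape of a Schwartzian transform.
# The index keeps ties in their original order and avoids comparing rows.
decorated = [(sort_key(row), i, row) for i, row in enumerate(rows)]
decorated.sort()
via_transform = [row for _, _, row in decorated]

# The built-in route: sorted() computes each key once and sorts on it.
via_key = sorted(rows, key=sort_key)

for row in via_key:
    print(*row, sep="\t")
```

Both lists end up in the same order; the `key=` argument simply hides the decorate and undecorate steps.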

4.7 years ago
JC 13k

Using a Perl one-liner:

$ perl -lane '($F[1]>$F[2]) ? print "$F[0]\t$F[2]\t$F[1]" : print $_' < in
NODE_55_length_30858_cov_27.421 19901   19951
NODE_55_length_30858_cov_27.421 19900   19930
NODE_55_length_30858_cov_27.421 17578   17613
NODE_55_length_30858_cov_27.421 16544   16578
NODE_55_length_30858_cov_27.421 19932   19982
4.7 years ago
shawn.w.foley ★ 1.3k

You can do something like that with awk.

awk 'BEGIN{OFS="\t";FS="\t"} {if ($2 < $3) print $0; else print $1,$3,$2}' file.txt

This defines the input field separator (FS) and the output field separator (OFS) as tabs, then states that if column 2 < column 3, print the line unchanged; otherwise print column 1, column 3, column 2. Because FS is set to a tab, the input file must be tab-delimited.

Input:

NODE_55_length_30858_cov_27.421 19951   19901
NODE_55_length_30858_cov_27.421 19930   19900
NODE_55_length_30858_cov_27.421 17578   17613
NODE_55_length_30858_cov_27.421 16544   16578
NODE_55_length_30858_cov_27.421 19982   19932

Output:

NODE_55_length_30858_cov_27.421 19901   19951
NODE_55_length_30858_cov_27.421 19900   19930
NODE_55_length_30858_cov_27.421 17578   17613
NODE_55_length_30858_cov_27.421 16544   16578
NODE_55_length_30858_cov_27.421 19932   19982
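
If the file is delimited by spaces rather than tabs, or if Python is more convenient, the same swap could be sketched roughly as follows. This is only a minimal sketch: `order_columns` is a made-up helper name, and it assumes whitespace-delimited lines whose second and third fields are integers (a header line with text in those columns would raise a `ValueError`).

```python
def order_columns(line):
    """Return the line with fields 2 and 3 in ascending numeric order."""
    fields = line.split()  # assumption: fields separated by spaces or tabs
    if len(fields) < 3:
        return line.rstrip("\n")  # leave blank or short lines untouched
    # Compare as integers, not as text, so that 9 sorts before 100.
    low, high = sorted(fields[1:3], key=int)
    # Output is tab-separated, matching the awk and Perl answers.
    return "\t".join([fields[0], low, high] + fields[3:])


# Two lines from the question's example file.
example = [
    "NODE_55_length_30858_cov_27.421 19951 19901",
    "NODE_55_length_30858_cov_27.421 17578 17613",
]
for line in example:
    print(order_columns(line))
```

To run it on a real file, loop over the open file handle instead of the `example` list.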

Thanks for helping.
