How to sort files
2.5 years ago
Nelo ▴ 20

Hi

I have two files of IDs. Both contain the same IDs, but in a different order:

test1_ID                          test2_ID
 17547                             14568
 18643                             18643
 14568                             17547
 12407                             47984
 47984                             12407

I want to sort test2_ID into the same order as test1_ID using the command line.

command sort • 904 views
2.5 years ago

csvtk sort supports sorting by user-defined levels. From its help text:

-k, --keys strings keys (multiple values supported). sort type supported, "N" for natural order, "n" for number, "u" for user-defined order and "r" for reverse. e.g., "-k 1" or "-k A:r" or "-k 1:nr -k 2" (default [1])

-L, --levels strings user-defined level file (one level per line, multiple values supported). format: <field>:<level-file>. e.g., "-k name:u -L name:level.txt"

csvtk sort -H -k 1:u -L 1:test1_ID.txt test2_ID.txt
2.5 years ago
Zeng Jingyu ▴ 60

In R, I would do this:

Read the two files into data frames a and b:

a
  test1_ID value
1    17547     1
2    18643     2
3    14568     3
4    12407     4
5    47984     5
b
  test2_ID value
1    14568     2
2    18643     3
3    17547     4
4    47984     5
5    12407     6
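# Sort both data frames by ID in decreasing order so that their rows line up.
# Note that this also changes the row order of a.
b <- b[order(b$test2_ID, decreasing = TRUE), ]
a <- a[order(a$test1_ID, decreasing = TRUE), ]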
2.5 years ago
5heikki 11k

You could do it like this, but you lose the header:

join -1 2 -2 1 -t $'\t' -o 1.1,2.1 \
    <(awk 'BEGIN{OFS="\t"}{print NR,$1}' file1 | sort -t $'\t' -k2,2) \
    <(sort file2) \
    | sort -t $'\t' -k1,1g \
    | awk 'BEGIN{FS="\t"}{print $2}'

So here we add an index column to file1, then join the two files on their common values, and finally sort by the index column to restore file1's order.
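If keeping the header matters, a two-pass awk approach may be worth a try instead of join. This is only a sketch under a few assumptions: both files have exactly one header line, the ID is the first whitespace-separated column, and the IDs in the file being reordered are unique. The function name reorder_by and the temporary file names are made up for illustration.

```shell
# Hypothetical sample files mirroring the IDs in the question, with headers.
tmp=$(mktemp -d)
printf '%s\n' test1_ID 17547 18643 14568 12407 47984 > "$tmp/file1"
printf '%s\n' test2_ID 14568 18643 17547 47984 12407 > "$tmp/file2"

# reorder_by ORDER_FILE TARGET_FILE (hypothetical helper)
# Pass 1 (NR == FNR) reads TARGET_FILE: keeps its header and stores every
# other line keyed by the first column.
# Pass 2 reads ORDER_FILE: prints the stored header in place of ORDER_FILE's
# header, then prints the stored TARGET_FILE line for each ID, in ORDER_FILE order.
reorder_by() {
    awk '
        NR == FNR   { if (FNR == 1) header = $0; else row[$1] = $0; next }
        FNR == 1    { print header; next }
        ($1 in row) { print row[$1] }
    ' "$2" "$1"
}

reorder_by "$tmp/file1" "$tmp/file2"
```

IDs that appear only in one of the files are silently dropped, which matches the behaviour of the join version above.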
