Extract nth column from file and drop everything after dot
9 months ago
BATMAN • 0

Hello, how are you?

I have several table files with the following information, separated by tabs:

GH5_8   Bacteria    Actinoalloteichus fjordicus ADI127-7    APU14662.1
GH5_8   Bacteria    Actinoalloteichus hoggarensis DSM 45943 ASO20105.1
GH5_8   Bacteria    Actinoalloteichus sp. AHMU CJ021    AUS77477.1
GH5_8   Bacteria    Actinoalloteichus sp. GBA129-24 APU20630.1
GH5_8   Bacteria    Actinobacteria bacterium YIM 96077  AYY15149.1
GH5_8   Bacteria    Actinokineospora sp. UTMC 2448  UVS80063.1

and I would like to extract only the accession codes (fourth column) and store them in a list XXXX, as follows:

APU14662
ASO20105
AUS77477
APU20630
AYY15149
UVS80063

How can I set up the command? Thanks

linux grep • 644 views

Since you tagged this post with several tools ("grep cut awk sed"), what have you tried so far?


What is the separator between the columns? Is it a tab?


Yes, the separator is a tab.

9 months ago
GenoMax 141k

Then

$ cat file | cut -f4 -d$'\t' 
UVS80063.1

should work


cut defaults to tab so you could leave out the delimiter argument. (On the other hand I had no idea bash had this ANSI-C quoting feature so I'm glad you included that!)
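
To illustrate the point about the default delimiter, here is a minimal sketch. The temporary directory and the file name `sample.tsv` are assumptions made up for the example; the row is copied from the question. The tab is built with `printf` so the snippet also runs in shells that lack the `$'\t'` quoting.

```shell
# Recreate one tab-separated row from the question (file name is hypothetical).
tmpdir=$(mktemp -d)
printf 'GH5_8\tBacteria\tActinokineospora sp. UTMC 2448\tUVS80063.1\n' > "$tmpdir/sample.tsv"

# Explicit tab delimiter; in bash, -d$'\t' is a shorter way to write the same thing.
tab=$(printf '\t')
with_delim=$(cut -f4 -d "$tab" "$tmpdir/sample.tsv")

# No -d option: cut uses tab as its default field delimiter.
without_delim=$(cut -f4 "$tmpdir/sample.tsv")

echo "$with_delim"
echo "$without_delim"
```

Both commands should print the same fourth field, so `-d` is only needed when the separator is something other than a tab.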


Thanks, how can I get rid of the ".1" after that?


If you want to remove the period and everything after it, do:

$ cat file | cut -f4 | cut -f1 -d '.'
UVS80063
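
Since the question mentions several table files and storing the result in a list, the same idea could be extended roughly as follows. This is only a sketch: the file names `table1.tsv`, `table2.tsv`, `accessions.txt` and `accessions_awk.txt` are made up for the example, and the two input rows are copied from the question.

```shell
# Hypothetical input files recreated from two sample rows in the question.
workdir=$(mktemp -d)
printf 'GH5_8\tBacteria\tActinoalloteichus fjordicus ADI127-7\tAPU14662.1\n' > "$workdir/table1.tsv"
printf 'GH5_8\tBacteria\tActinokineospora sp. UTMC 2448\tUVS80063.1\n' > "$workdir/table2.tsv"

# cut accepts several files at once: take column 4, then keep the text before the first period.
cut -f4 "$workdir"/table*.tsv | cut -d. -f1 > "$workdir/accessions.txt"

# Alternative with a single awk call: split on tabs, delete from the first period onward.
awk -F'\t' '{ sub(/\..*/, "", $4); print $4 }' "$workdir"/table*.tsv > "$workdir/accessions_awk.txt"

cat "$workdir/accessions.txt"
```

Both variants write one accession per line. Note that they cut at the first period in the field, which is fine for accessions like `APU14662.1` but would truncate an identifier that contains a period before the version suffix.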