Question

Extract nth column from file and drop everything after dot

9 months ago

BATMAN • 0

hello, how are you?

I have several table files with the following information, separated by tabs:

GH5_8   Bacteria    Actinoalloteichus fjordicus ADI127-7    APU14662.1
GH5_8   Bacteria    Actinoalloteichus hoggarensis DSM 45943 ASO20105.1
GH5_8   Bacteria    Actinoalloteichus sp. AHMU CJ021    AUS77477.1
GH5_8   Bacteria    Actinoalloteichus sp. GBA129-24 APU20630.1
GH5_8   Bacteria    Actinobacteria bacterium YIM 96077  AYY15149.1
GH5_8   Bacteria    Actinokineospora sp. UTMC 2448  UVS80063.1

and I would like to extract only the accession codes (fourth column) and store them in a list XXXX as follows:

APU14662
ASO20105
AUS77477
APU20630
AYY15149
UVS80063

How can I set up the command? Thanks

linux grep • 644 views

ADD COMMENT • link updated 9 months ago by Ram 43k • written 9 months ago by BATMAN • 0

Since you tagged this post with several tools ("grep cut awk sed"), what have you tried so far?

ADD REPLY • link 9 months ago by Pierre Lindenbaum 161k

What is the separator between the columns, tab?

ADD REPLY • link 9 months ago by GenoMax 141k

Yes, the separator is a tab.

ADD REPLY • link 9 months ago by BATMAN • 0

score 2 · Answer 1 · 2023-07-12

9 months ago

GenoMax 141k

Then

$ cat file | cut -f4 -d$'\t' 
UVS80063.1

should work

ADD COMMENT • link 9 months ago by GenoMax 141k

cut defaults to tab so you could leave out the delimiter argument. (On the other hand I had no idea bash had this ANSI-C quoting feature so I'm glad you included that!)
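
To illustrate the point about the default delimiter, here is a minimal sketch. The file name sample.tsv and its two rows are assumptions made up for the demonstration (copied from the question's data). Instead of bash's `$'\t'` quoting, it uses `printf` to produce the tab, which should also work in plain POSIX shells.

```shell
# Build a small tab-separated sample (sample.tsv is a hypothetical file name)
printf 'GH5_8\tBacteria\tActinoalloteichus fjordicus ADI127-7\tAPU14662.1\n' > sample.tsv
printf 'GH5_8\tBacteria\tActinokineospora sp. UTMC 2448\tUVS80063.1\n' >> sample.tsv

# Explicit tab delimiter, produced portably with printf
tab=$(printf '\t')
explicit=$(cut -d "$tab" -f4 sample.tsv)

# No -d option: cut falls back to its default delimiter, the tab
implicit=$(cut -f4 sample.tsv)

# Both variables hold the same fourth column
echo "$implicit"
```

Spaces inside a field (such as the species names) do not matter here, because only tabs separate the columns.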

ADD REPLY • link 9 months ago by Jesse ▴ 740

Thanks, how can I get rid of the ".1" after that?

ADD REPLY • link 9 months ago by BATMAN • 0

4

If you want to remove the stuff including and after the period do:

$ cat file | cut -f4 | cut -f1 -d '.'
UVS80063
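
Since the question also asks to store the result, here is a sketch of one way to do that, plus a single-command awk variant. The file names sample.tsv, accessions.txt and accessions_awk.txt are assumptions for illustration only, and the sample rows are copied from the question.

```shell
# Build a small tab-separated sample (hypothetical file name)
printf 'GH5_8\tBacteria\tActinoalloteichus fjordicus ADI127-7\tAPU14662.1\n' > sample.tsv
printf 'GH5_8\tBacteria\tActinokineospora sp. UTMC 2448\tUVS80063.1\n' >> sample.tsv

# cut can read the file directly; keep field 4, then keep the part before the first dot
cut -f4 sample.tsv | cut -d '.' -f1 > accessions.txt

# awk alternative: split on tabs, delete everything from the first dot in field 4
awk -F'\t' '{ sub(/\..*/, "", $4); print $4 }' sample.tsv > accessions_awk.txt

# Show the stored list
cat accessions.txt
```

Both variants cut at the first dot, so an accession with more than one dot would lose everything after the first one.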

ADD REPLY • link 9 months ago by GenoMax 141k