Removing text after last underscore on given column
3.6 years ago

Hey guys,

I have a tab-delimited file like this

NZ_CP007546.1_En_asburiae4561905        434     17636
NZ_CP007546.1_En_asburiae4561905        85823   93173
NZ_CP007546.1_En_asburiae4561905        178912  203202
NZ_CP007546.1_En_asburiae4561905        313008  317041

...

I want to remove text after the last underscore on the 1st column, so that I have

NZ_CP007546.1_En        434     17636
NZ_CP007546.1_En        85823   93173
NZ_CP007546.1_En        178912  203202
NZ_CP007546.1_En        313008  317041

I know that I can use sed for this, but sed -i 's/_[^_]*$//' strips everything from the last underscore on the line through the end of the line, and my goal is to change only the 1st column. Thanks!

sequence • 1.4k views
3.6 years ago
cat in.txt | rev | sed 's/\t[^_\t]*_/\t/'  | rev

EDIT:

Sorry, much simpler is:

sed 's/_[^_\t]*\t/\t/' < input.txt
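The `\t` escape inside a sed pattern is a GNU extension and may not be understood by other sed implementations. A minimal sketch of the same idea with a literal tab held in a shell variable is below; the file names `sample2.tsv` and `trimmed2.tsv` and the variable `TAB` are made up for illustration.

```shell
# Hypothetical one-line sample, tab-delimited like the file in the question
printf 'NZ_CP007546.1_En_asburiae4561905\t434\t17636\n' > sample2.tsv

# Hold a literal tab so the pattern does not depend on sed understanding \t
TAB=$(printf '\t')

# Delete from the last underscore before the first tab up to that tab,
# then put the tab back so the columns stay separated
sed "s/_[^_${TAB}]*${TAB}/${TAB}/" sample2.tsv > trimmed2.tsv
```

Only the first match on each line is replaced, so underscores in later columns would be left alone as long as the first column contains at least one underscore.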
3.6 years ago
JC 13k
perl -pe 's/(_\w+)_\w+/$1/' < in.txt
NZ_CP007546.1_En        434     17636 
NZ_CP007546.1_En        85823   93173
NZ_CP007546.1_En        178912  203202
NZ_CP007546.1_En        313008  317041
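Another option, not from the answers above, is to let awk split the columns so the substitution can only ever touch the first field. This is a sketch under the assumption that the file is strictly tab-delimited; `sample.tsv` and `trimmed.tsv` are hypothetical file names.

```shell
# Hypothetical sample built from two rows of the question's data
printf 'NZ_CP007546.1_En_asburiae4561905\t434\t17636\n' > sample.tsv
printf 'NZ_CP007546.1_En_asburiae4561905\t85823\t93173\n' >> sample.tsv

# Split and rejoin on tabs; strip the last underscore and what follows it
# from field 1 only, then print the rebuilt line
awk 'BEGIN { FS = OFS = "\t" } { sub(/_[^_]*$/, "", $1); print }' sample.tsv > trimmed.tsv
```

Because `$` here anchors to the end of the first field rather than the end of the line, this is the original `_[^_]*$` pattern from the question applied to one column at a time.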